In order for autonomous systems to interact with their environment in an intelligent way,
they must be given the ability to adapt and learn incrementally and deliberately. It is
virtually impossible to devise and hand code all potentially relevant domain knowledge
for complex dynamic tasks. This thesis describes a framework to acquire domain knowledge
for planning by failure-driven experimentation with the environment. The initial
domain knowledge in the system is an approximate model for planning in the environment,
defining the system’s expectations. The framework exploits the characteristics of
planning domains in order to search the space of plausible hypotheses without the need
for additional background knowledge to build causal explanations for expectation failures.
Plans are executed while the external environment is monitored, and differences between
the internal state and external observations are detected by various methods, each
correlated with a typical cause for the expectation failure. The methods also construct a set
of concrete hypotheses to repair the knowledge deficit. After the set is heuristically filtered,
each remaining hypothesis is tested in turn with an experiment. After the experiment is designed,
a plan is constructed to achieve the situation required to carry out the experiment. The
experiment plan must meet constraints such as minimizing plan length and negative
interference with the main goals. The thesis describes a set of domain-independent
constraints for experiments and their incorporation in the planning search space. After the
execution of the plan and the experiment, observations are collected to determine whether the
experiment was successful. Upon success, the hypothesis is confirmed and the
domain knowledge is adjusted. Upon failure, the experimentation process is iterated on
the remaining hypotheses until success or until no more hypotheses are left to be
considered. This framework has been shown to be an effective way to address incomplete planning
knowledge and is demonstrated in a system called EXPO, implemented on the PRODIGY
planning architecture. The effectiveness and efficiency of EXPO’s methods are empirically
demonstrated in several domains, including a large-scale process planning task, where
the planner can recover from situations in which up to 50% of the domain knowledge is
missing through repeated experimentation.

Chapter 1
Introduction 


Learning has proven to be a vital ingredient in transforming planners from research
tools into real-world applications. Of foremost concern has been improving
the efficiency of planning. The learning techniques that have been applied range
from macro-operator learning [Fikes et al., 1972; Korf, 1985] and acquisition of control
knowledge for guiding search [Mitchell et al., 1983; Minton, 1988; Etzioni, 1990], to the
synthesis of abstraction hierarchies [Sacerdoti, 1974; Christensen, 1991; Knoblock, 1991].
These techniques fall under the rubric of speed-up learning, and they share the property
of acquiring more effective ways of expressing the knowledge that the system already
implicitly has. After learning, a planner solves more efficiently the same kinds of problems
that it was able to solve before learning. In other words, it is able to solve more problems
within a given time bound. This type of learning is also known as symbol-level learning
[Dietterich, 1986].

But learning is also necessary in other dimensions of planning systems. The
representation given to the planner is bound to contain many inaccuracies, which may be
corrected automatically through a learning cycle. Human planners in any sizable domain
(e.g., factory production planning, routing in transportation planning, configuration in
telecommunication networks, and so on) rarely make the assumption that they have
omniscient world knowledge. A much more realistic assumption is that the given domain
knowledge is operationally accurate and complete, but there is a recovery procedure to
acquire more knowledge or correct existing knowledge if and when this assumption is
violated. Learning has, in this case, a different meaning. The new knowledge will enable
the planner to solve problems that it was not capable of solving before learning, no matter
what the time bound was. As Newell describes this situation [Newell, 1982]:

“... When we say ... that a program ‘can’t do action A, because it doesn’t
have knowledge A'’, we mean that no amount of processing by the processes
now in the program on the data structures now in the program can yield the
selection of A.”

(a) Traditional knowledge acquisition and refinement
(b) Automated knowledge refinement and incremental acquisition by experimentation

Figure 1.1: Knowledge acquisition and refinement using experimentation. An initial
knowledge base is obtained from the domain experts, but is refined autonomously through
direct interactions with the external world.

A qualitative augmentation of the knowledge available to a planner goes beyond the
reformulation of its initial knowledge. This type of learning is known as learning at the
knowledge level [Dietterich, 1986]. This area has received less attention from the planning
and learning communities, but it is of major importance for building autonomous
intelligent systems.

Augmenting  incomplete  models  benefits  planning  in  three  different  ways.  First,  the 
coverage  is  expanded  because  a  planner  can  solve  more  problems  after  acquiring  the 
knowledge  needed.  Second,  the  prediction  accuracy  is  raised  since  learning  side  effects 
and  unusual  conditions  allow  for  planning  further  ahead.  Lastly,  the  ability  to  adapt 
provides  increased  autonomy  to  the  planner. 

Many  systems  for  guiding  the  acquisition  of  knowledge  can  be  found  in  the  literature 
(see  [Marcus,  1990]  for  an  overview).  Knowledge  acquisition  tools  provide  a  framework 
for  the  direct  interaction  of  knowledge  engineers  with  domain  experts.  The  resulting 
knowledge  base  is  an  approximate  model  of  the  domain,  whose  degree  of  correctness 
and completeness varies with the complexity of the task domain. The knowledge engineers
engage in test-and-revise procedures to refine the knowledge base asymptotically
until  a  satisfactory  model  is  obtained,  as  shown  in  Figure  1.1(a).  Our  work  is  concerned 
with the acquisition of knowledge for planning domains. None of the current knowledge
acquisition systems is designed for planning domains, nor do they emphasize full automation.
Planning systems offer the possibility of direct interaction with the environment.
The  autonomous  refinement  and  acquisition  of  knowledge  by  directed  experimentation 
is  invoked  once  an  initial  approximation  of  the  knowledge  base  is  available,  as  shown 
in  Figure  1.1(b).  Such  is  the  learning  model  presented  in  this  thesis:  a  failure-driven 
experimentation-based  method  for  incremental  acquisition  of  domain  knowledge.  In 
essence,  impasses  in  planning  or  divergences  between  internal  expectations  and  external 
observations  trigger  the  learning  procedure.  Learning  is  autonomous  and  unsupervised, 
the interaction with the environment being the only source of additional knowledge.


1.1  Learning  By  Experimentation 

Figure  1.2  gives  an  overview  of  the  experimentation  process  described  in  this  thesis. 
Learning is triggered when one of the actions in a plan has unexpected consequences.
The first step is to come up with a set of hypotheses that might explain what went wrong.
The  next  step  is  to  choose  a  hypothesis  from  that  set  and  devise  an  experiment  to  test 
it.  This  may  involve  creating  a  certain  state  of  affairs,  which  may  itself  require  planning 
to  set  up  the  experiment.  Similarly,  after  the  experiment  is  concluded  (successfully  or 
unsuccessfully),  planning  may  be  necessary  to  return  to  the  state  of  the  world  that  existed 
before the experiment. These stages are familiar as a part of the scientific method. But
humans demonstrate in their everyday life that experimentation is also a powerful tool
for acquiring knowledge outside of the laboratory.

Figure 1.2: Experimentation at a glance. Failures in the execution of a plan trigger
learning. A general cause for the failure is hypothesized, then instantiated to a particular
hypothesis. The design of experiments includes planning the experimental setup.

EXPO,  the  system  described  in  this  thesis,  automates  this  experimentation  process 
and  shows  that  it  is  a  useful  technique  for  acquiring  knowledge  for  planning  systems. 
Each of EXPO’s stages is described in detail in the succeeding chapters. To illustrate
some of the issues involved, we turn now to an example of how people use experimentation
to autonomously augment their knowledge about the world.

Consider  the  problem  of  getting  ready  for  work  in  the  morning.  Given  adequate 
domain knowledge, we can easily come up with a plan to achieve this goal. Suppose that
one  of  the  subgoals  is  drying  one’s  hair.  One  possible  plan  to  achieve  this  goal  is:  get 
hair  dryer,  plug  in  hair  dryer,  turn  on  hair  dryer,  and  blow  hair  dry.  But  sometimes  our 
actions  may  not  yield  the  expected  results  when  executed  in  the  real  world.  For  example, 
suppose  that  one  day  the  hair  dryer  fails  to  function  when  we  turn  it  on.  At  this  point, 
we  have  two  alternatives.  One  is  to  find  another  person  (if  one  is  available)  and  ask  for 
an  explanation.  The  other  is  to  engage  in  experimentation  to  determine  the  cause  of 
the  failure.  The  advantage  of  experimentation  is  that  learning  is  done  autonomously,  an 
important  ability  of  human  beings  that  we  would  like  to  model  in  our  intelligent  systems. 

The  first  step  of  the  experimentation  process  is  to  generate  hypotheses  that  explain 
the  failure.  One  general  class  of  hypotheses  is  that  the  person’s  knowledge  about  the 
state  of  the  world  is  incorrect.  For  example,  three  particular  hypotheses  might  be: 

•  the  hair  dryer  is  broken 

•  the  outlet  is  broken 

•  the  hair  dryer  is  not  firmly  plugged  in 

Another general class of hypotheses is that the person’s model of the action is
incorrect. In this case, we look for conditions under which we are currently executing the
action  that  may  cause  the  failure.  For  example: 

•  Today  is  Saturday,  and  the  hair  dryer  does  not  work  on  Saturdays. 

•  It  is  noon,  and  the  hair  dryer  only  works  in  the  morning. 

•  The  light  switch  of  the  bathroom  is  off,  and  the  hair  dryer  works  only  when  the 
light  switch  is  on. 

•  The  bathroom  window  is  open,  and  the  hair  dryer  only  works  when  it  is  closed. 

•  The  door  is  open,  and  the  hair  dryer  only  works  when  it  is  closed. 

We may construct other hypotheses, but let us restrict this discussion to the ones
above.  The  next  step  is  to  choose  a  hypothesis  and  then  design  and  perform  experiments 
that  prove  or  disprove  it.  Suppose  that  we  decide  to  check  first  that  the  state  of  the 
world  is  actually  what  we  believe. 

The  first  step  is  to  calibrate  the  hypotheses  and  choose  which  ones  to  look  at  first. 
The  first  one  (broken  hair  dryer)  seems  hard  to  test  for  someone  who  is  not  mechanically 
inclined.  So  we  decide  to  try  the  other  hypotheses  first.  To  test  if  the  hair  dryer  is  not 
firmly  plugged  in,  we  do  a  simple  experiment:  we  plug  the  hair  dryer  in  firmly  and  turn 
it  on  again.  This  does  not  make  the  hair  dryer  work,  so  the  hypothesis  is  rejected.  This 
experiment  was  quite  simple,  but  our  next  hypothesis  requires  a  more  elaborate  one. 
Suppose  that  the  outlet  is  broken.  One  possible  experiment  to  test  it  is  to  plug  another 
device  in  the  outlet.  To  do  so,  we  build  a  plan  with  the  following  steps:  unplug  hair 
dryer,  get  another  device  (maybe  an  old  hair  dryer),  plug  in  device.  This  plan  brings  a 
state  of  affairs  where  we  can  do  our  experiment:  to  turn  on  the  device  and  see  if  it  works. 
After  turning  it  on,  we  observe  the  results:  the  device  is  operating.  This  disproves  the 
hypothesis  that  the  outlet  is  broken,  and  we  move  to  consider  another  hypothesis.  But 
first,  we  need  to  go  back  to  the  state  of  affairs  before  the  experiment.  So  we  create  a 
plan  to  plug  the  hair  dryer  back  in:  turn  device  off,  unplug  device,  store  device  away, 
plug  in  hair  dryer. 

Now  we  are  ready  to  look  at  another  hypothesis.  For  example,  we  may  consider  now 
conditions  under  which  we  are  trying  to  make  the  hair  dryer  work,  which  is  our  second 
class of hypotheses. Again, we calibrate them and decide which ones to consider first.
Our  previous  experience  with  hair  dryers  helps  us  decide  which  hypotheses  are  more  likely 
to  be  relevant.  We  have  successfully  used  our  hair  dryer  at  various  times  and  on  different 
days  of  the  week,  so  the  first  two  hypotheses  are  ruled  out.  The  third  hypothesis  is  more 
plausible, because every time we have used the hair dryer before the lights were on and
they  are  off  now.  So  we  try  an  experiment.  We  build  a  plan  to  turn  on  the  bathroom 
lights,  and  then  we  turn  on  the  hair  dryer.  This  makes  it  work  because  the  light  switch 
controls  the  power  of  the  outlet.  So  we  conclude  that  the  hair  dryer  can  only  be  used  in 
this  bathroom  when  the  lights  are  on. 
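
The second half of this example can be sketched in a few lines of Python. This is only an illustration under our own assumptions, not EXPO's implementation: the state variables, the function names, and the filtering rule (keep a condition only if it held in every past success and does not hold now) are hypothetical stand-ins for the informal reasoning above. The function dryer_works plays the role of the real world and is hidden from the learner.

```python
from typing import Dict, List, Optional, Tuple

State = Dict[str, bool]


def dryer_works(state: State) -> bool:
    """Simulated world (hidden from the learner): the light switch powers the outlet."""
    return state["plugged_in"] and state["lights_on"]


def plausible_conditions(current: State, successes: List[State]) -> List[Tuple[str, bool]]:
    """Propose 'the action needs var == value' for each variable that had the
    same value in every past success and has a different value now."""
    hypotheses = []
    for var, now in current.items():
        seen = {s[var] for s in successes}
        if len(seen) == 1 and now not in seen:
            hypotheses.append((var, seen.pop()))
    return hypotheses


def experiment(current: State, successes: List[State]) -> Optional[Tuple[str, bool]]:
    """Test the hypotheses one at a time, changing a single variable per trial."""
    for var, value in plausible_conditions(current, successes):
        trial = dict(current)      # copy, so the original state is preserved
        trial[var] = value         # set up the experimental situation
        if dryer_works(trial):     # execute the action and observe the result
            return (var, value)    # hypothesis confirmed: a new condition is learned
    return None                    # no hypothesis left to consider


# Past successful uses: different days and times, lights always on.
past = [
    {"plugged_in": True, "lights_on": True, "is_saturday": False,
     "is_morning": True, "window_closed": True},
    {"plugged_in": True, "lights_on": True, "is_saturday": True,
     "is_morning": False, "window_closed": True},
]
today = {"plugged_in": True, "lights_on": False, "is_saturday": True,
         "is_morning": False, "window_closed": False}

print(plausible_conditions(today, past))
print(experiment(today, past))
```

With this data the day and the time of day are ruled out by past experience, the light switch and the window remain as candidates, and the first experiment (turning the lights on) already confirms its hypothesis.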

In summary, with this type of experimentation people autonomously acquire knowledge
about the environment that is necessary for solving problems. Notice that in this
example  in  particular,  but  also  in  general,  we  did  not  rely  upon  any  detailed  knowledge 
about  hair  dryers,  power  outlets,  or  light  switches  to  guide  the  experimentation  process. 
The  automation  of  this  experimentation  process  based  on  shallow  experiential  knowledge 
is  the  main  concern  of  this  thesis. 


1.2  Methodology

The  aim  of  this  thesis  is  to  contribute  to  learning  at  the  knowledge  level  autonomously 
from  the  environment.  In  this  thesis,  a  complete  and  correct  domain  is  used  to  simulate 
the real world. The planner is given an incomplete version of the domain that, via
experimentation, it attempts to flesh out incrementally into a complete model. The new
knowledge learned by experimentation is incorporated into the domain and is immediately
available to the planner. The planner in turn provides a performance element to measure
any improvements in the knowledge base. This is a closed-loop integration of planning
and learning by experimentation. The thesis provides a theoretical framework for this
integration, as well as a practical demonstration in a system called EXPO and its
interaction with the PRODIGY planner [Minton et al., 1989a; Carbonell et al., 1991; Carbonell
et al., 1990].
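
The evaluation setup just described can be illustrated with a minimal Python sketch. It assumes a simple STRIPS-like operator with preconditions, add effects, and delete effects; the Operator class, the open-door operator, and the literal names are hypothetical and do not reproduce PRODIGY's representation. A complete operator stands in for the real world, the planner holds an incomplete copy, and an expectation failure is the difference between the state the planner predicts and the state the simulated world produces.

```python
from dataclasses import dataclass, replace
from typing import FrozenSet, Set


@dataclass(frozen=True)
class Operator:
    name: str
    preconds: FrozenSet[str]
    adds: FrozenSet[str]
    deletes: FrozenSet[str] = frozenset()

    def apply(self, state: Set[str]) -> Set[str]:
        """Return the successor state; the state is unchanged if a precondition fails."""
        if not self.preconds <= state:
            return set(state)
        return (state - self.deletes) | self.adds


# Complete operator: stands in for the real world.
world_op = Operator("open-door",
                    frozenset({"at-door", "door-unlocked"}),
                    frozenset({"door-open"}))
# Incomplete operator given to the planner: 'door-unlocked' is missing.
planner_op = Operator("open-door",
                      frozenset({"at-door"}),
                      frozenset({"door-open"}))

state = {"at-door"}
expected = planner_op.apply(state)     # what the planner predicts
observed = world_op.apply(state)       # what the simulated world returns
failure = expected != observed         # expectation failure triggers learning
missing_effects = expected - observed  # predicted but not observed

# Once the missing precondition has been identified (supplied by hand here,
# found by experimentation in the thesis), the planner's operator is repaired.
repaired = replace(planner_op, preconds=planner_op.preconds | {"door-unlocked"})

print(failure, missing_effects)
print(repaired.apply(state) == world_op.apply(state))
```

After the repair, the planner's prediction and the simulated world agree on this state, which is the sense in which the incomplete domain is fleshed out toward the complete one.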

The planner is given some initial knowledge base that may contain a number of
imperfections, each with its own idiosyncrasies. Incorrect facts may lead to contradictions.
Lack of knowledge limits the capabilities of the planner. This thesis concentrates on the
refinement of knowledge bases that are initially incomplete, i.e., ignorant of facts that
are true and needed for the task at hand. The lack of information to solve a task causes
a knowledge impasse that triggers learning. We do not address curiosity-driven
exploration. Learning is always driven by the need to accomplish some task. One way to solve
knowledge impasses is by directed experimentation. This thesis presents methods that
set the context for systematic experiments (i.e., what type of knowledge is missing, where it
is missing, ...) to address different faults of the domain knowledge. These methods are
domain independent, yet they are shown to be very effective through empirical tests of
EXPO.

Once  a  context  is  set  for  the  experimentation,  we  address  the  issue  of  the  design  of 
specific experiments. Not all experiments are equally desirable. Changing one variable at
a time, minimizing interaction with the environment, and minimizing resource consumption
are among the heuristics typically proposed. This thesis shows that good choices can be
made by domain-independent rules that can be used to define experimentation strategies.
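
As a rough illustration of how such heuristics could be combined into a strategy, the Python sketch below ranks candidate experiments by the number of variables they change, then by the length of the setup plan, then by resource cost. The Experiment record, its fields, the lexicographic ordering, and the numbers attached to each candidate are our own assumptions for this sketch; they are not the rules used by EXPO.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Experiment:
    name: str
    variables_changed: int  # ideally one variable at a time
    setup_steps: int        # length of the plan that sets up the experiment
    resources_used: int     # hypothetical cost unit


def rank(experiments: List[Experiment]) -> List[Experiment]:
    """Order candidates lexicographically: fewest variables changed first,
    then shortest setup plan, then lowest resource consumption."""
    return sorted(
        experiments,
        key=lambda e: (e.variables_changed, e.setup_steps, e.resources_used),
    )


candidates = [
    Experiment("plug another device into the outlet", 1, 3, 0),
    Experiment("replace dryer and switch lights on", 2, 4, 1),
    Experiment("plug the dryer in firmly", 1, 1, 0),
]

for e in rank(candidates):
    print(e.name)
```

A lexicographic key is only one possible design choice; a weighted sum of the same quantities would trade plan length against resource use instead of strictly prioritizing one over the other.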

A  domain-independent  approach  is  certainly  desirable.  But  an  additional  aim  of  the 
thesis  is  to  rely  exclusively  on  the  knowledge  given  initially  for  planning.  This  means  that 
the  learning  occurs  even  when  no  causal,  structural,  or  common  sense  knowledge  (other 
than  the  one  embedded  in  the  domain  model)  is  available.  This  is  a  major  advantage, 
since  we  do  not  need  to  address  in  turn  the  acquisition  and  refinement  of  that  additional 
and  necessarily  complex  background  knowledge. 

Not  only  is  our  methodology  applicable  across  domains  and  independent  of  additional 
knowledge,  it  also  yields  efficient  learning.  This  is  shown  by  EXPO’s  empirical  results  in 
two  different  domains,  one  of  them  of  considerable  size  and  complexity. 


1.3  The Application Domains

The  methods  described  in  this  thesis  were  tested  and  evaluated  in  two  domains:  a  robot 
planning  domain  and  a  complex  process  planning  domain. 

The robot planning domain [Minton et al., 1989b] is an extension of the one used
by STRIPS [Fikes and Nilsson, 1971] that has been used in other research [Minton,
1988; Etzioni, 1990; Knoblock, 1991]. A robot can push and carry objects to move
them between rooms. Rooms are connected through doors that may be opened, closed,
and locked or unlocked with the appropriate keys. The rooms can be in any topological
configuration  and  there  can  be  any  number  of  rooms,  doors,  keys,  and  boxes.  The  domain 
is  described  in  detail  in  Appendix  A. 

The  process  planning  domain  contains  a  large  body  of  knowledge  about  the  operations 
necessary  to  machine  and  finish  metal  parts  [Gil,  1991].  This  domain  was  chosen  because 
it  has  considerable  size  in  many  dimensions  (one  order  of  magnitude  bigger  than  most 
planning  domains  in  the  AI  literature),  which  makes  the  empirical  results  of  the  thesis 
more  definitive  and  scalable.  A  typical  problem  in  this  domain  is  to  produce  a  rectangular 
block  of  5”  X  2”  X  1”  made  of  aluminum  and  with  a  centered  hole  of  diameter  1/32” 
running through the length of the part. To perform a machining operation on a part, the
part  must  be  securely  held  by  some  holding  device.  Each  machine  uses  different  tools, 
and  the  appropriate  tool  for  the  operation  must  be  installed  in  the  machine.  Appendix 
B  can  be  consulted  for  more  detailed  information  on  this  formalization. 


1.4  Summary  of  Contributions 

The  contributions  of  this  thesis  are: 

•  A  domain-independent  method  to  acquire  domain  knowledge  for  planning 

•  Identification  of  important  issues  for  experimentation  in  planning 

•  Computationally  effective  methods  for  augmenting  incomplete  domain  knowledge 

•  Zero-knowledge heuristics for finding relevant hypotheses

•  Methodology  for  planning  efficient  experiments 

•  Full  implementation  integrated  in  PRODIGY  to  acquire  knowledge  effectively  in  two 
domains 

•  Empirical  validation  of  the  methods  via  the  PRODIGY  implementation  and  thorough 
testing 


1.5  Organization of the Thesis

The  chapters  in  this  thesis  are  organized  as  follows. 

Chapter 2 presents the related work. This chapter includes a review of work on
experimentation, repairing plan failures, and learning from the environment. However, this
thesis presents the first work on a planner that learns from the environment using
sophisticated experimentation techniques. Chapter 2 also discusses research on less directly
related  areas  such  as  rule  induction  and  imperfect  theory  refinement. 

Chapter 3 provides the context for the experimentation system described in the
succeeding chapters. It begins by describing the type of domain knowledge available to
a planner and the possible imperfections of that knowledge. Acquiring domain
knowledge is then cast in terms of concept learning, a well understood framework in which an
experimenter can be described as a learner that is active in the selection of examples.
The chapter turns next to how a planner can monitor the external world to detect plan
execution failures, and how it can manipulate the external world through experimentation.
This experimentation serves to pinpoint specific imperfections in the knowledge
base: only the ones responsible for plan failures. We call this type of experimentation
task-driven  experimentation  and  it  is  contrasted  with  other  types  of  experimentation 
in  the  chapter.  The  chapter  also  discusses  what  it  means  for  an  experimenter  to  be 
efficient.  It  finishes  with  a  description  of  the  PRODIGY  planner,  on  top  of  which  our 
experimentation  work  is  built. 

Chapter 4 describes the experimentation process as implemented in the EXPO system.
This process involves detecting a knowledge impasse, choosing promising hypotheses
to overcome it (which EXPO does using domain-independent heuristics), designing
experiments, executing them, and incorporating newly discovered facts into the planner’s
knowledge base. The chapter describes in detail how new preconditions are learned by
EXPO.

Chapter  5  takes  a  broad  view  of  methods  for  learning  by  experimentation.  It  is  a 
comprehensive  survey  of  various  types  of  incompleteness  that  can  exist  in  a  planner’s 
knowledge base. For each type of incompleteness it describes how experimentation
techniques can be used to locate and repair faults.

An  empirical  analysis  of  EXPO’s  performance  is  presented  in  Chapter  6.  Two  different 
types  of  tests  were  run.  In  the  first  case,  EXPO  is  shown  to  be  effective  in  that  the  planner 
is  able  to  solve  many  more  problems  after  learning — using  the  knowledge  acquired  by 
EXPO — than  it  could  solve  with  its  initial  knowledge.  Note  that  it  is  not  a  matter  of 
solving  the  problems  faster,  but  rather  a  matter  of  whether  the  problems  are  solvable  at 
all.  The  second  type  of  test  analyzes  how  efficient  EXPO  is  with  respect  to  the  number 
of  experiments  that  it  performs  and  the  amount  of  effort  required  to  perform  them. 

Finally, Chapter 7 presents conclusions, discusses the limitations of this work, and suggests
directions for future research. Two appendices follow; they describe in detail the
application domains both qualitatively and quantitatively.


Chapter  2 
Related  Work 


This  chapter  presents  a  discussion  on  previous  work  related  to  this  thesis.  The  first  section 
reviews the topic of experimentation in the AI literature. Work on concept learning,
both theoretical and practical, suggests that active learners (ones that participate in the
learning  process  by  asking  their  own  questions,  often  posed  as  experiments)  are  more 
powerful  than  passive  learners.  The  section  also  examines  experimentation  in  scientific 
discovery  systems.  Section  2.2  discusses  planning  systems  that  learn  by  experimentation 
and  planning  systems  that  learn  from  their  interaction  with  the  environment.  The  final 
section reviews some relevant work on theory refinement and rule induction.


2.1  Experimentation 

This  section  reviews  related  work  on  the  topic  of  experimentation.  The  work  is  divided 
here into two areas: concept learning and scientific discovery. Section 2.2 contains
references to some systems that use experimentation for acquiring domain knowledge for
planning. 

2.1.1  Experimentation  in  Concept  Learning 

Active learners (ones that participate in the learning process by asking their own
questions) that have the ability to formulate experiments are believed to be much more
powerful and efficient than passive learners that do not have that ability. Results in
formal  learning  theory  show  that  finding  a  consistent  hypothesis  is  NP-hard  for  many 
classes  of  representations  of  concepts  [Pitt  and  Valiant,  1988;  Haussler,  1989].  These 
results  are  based  on  a  scheme  in  which  a  passive  learner  collects  instances  through  an 
oracle called EXAMPLES. The learner calls the oracle, which randomly chooses an
example along with its classification as positive or negative. The use of this oracle may be
one of the core reasons for the discouraging results that have been obtained [Haussler,
1988]. In fact, humans seem to be more effective learners than the results show. This
may well be because the oracle EXAMPLES involves a very passive attitude on the part
of the learner. Research on other types of oracles shows better results [Angluin, 1987].
In particular, membership oracles accept an instance as an input and return its
classification (positive or negative). This type of oracle resembles more realistic setups for
learning. Amsterdam [1988] proposes an oracle called EXPERIMENT that accepts a
partial description of an example and returns a complete description (if one exists).
EXPERIMENT  is  shown  to  be  more  powerful  than  EXAMPLES. 
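
The difference between a passive and an active learner can be made concrete with a toy Python sketch. It is not taken from the cited work: the target concept (integers at or above a hidden threshold), the domain size N, and all function names are assumptions chosen for illustration. The passive learner receives instances chosen at random by an EXAMPLES-style oracle, while the active learner chooses its own instances and queries a membership oracle, which here amounts to binary search.

```python
import random
from typing import Tuple

N = 1024
THRESHOLD = 700  # hidden target concept: x is positive iff x >= THRESHOLD


def examples(rng: random.Random) -> Tuple[int, bool]:
    """EXAMPLES-style oracle: the oracle, not the learner, picks the instance."""
    x = rng.randrange(N)
    return x, x >= THRESHOLD


def membership(x: int) -> bool:
    """Membership oracle: the learner picks the instance to be classified."""
    return x >= THRESHOLD


def learn_actively() -> Tuple[int, int]:
    """Binary search with membership queries; returns (threshold, queries used)."""
    lo, hi, queries = 0, N, 0  # invariant: the threshold lies in [lo, hi]
    while lo < hi:
        mid = (lo + hi) // 2
        queries += 1
        if membership(mid):
            hi = mid
        else:
            lo = mid + 1
    return lo, queries


def learn_passively(rng: random.Random, max_draws: int = 100_000) -> Tuple[int, int]:
    """Draw random examples until the boundary is pinned down exactly."""
    largest_neg, smallest_pos, draws = -1, N, 0
    while smallest_pos - largest_neg > 1 and draws < max_draws:
        x, label = examples(rng)
        draws += 1
        if label:
            smallest_pos = min(smallest_pos, x)
        else:
            largest_neg = max(largest_neg, x)
    return smallest_pos, draws


print("active :", learn_actively())
print("passive:", learn_passively(random.Random(0)))
```

In this toy setting the active learner needs a number of queries logarithmic in N, whereas the passive learner must wait until the two instances adjacent to the boundary happen to be drawn, which takes on the order of N draws on average. The sketch illustrates the intuition only and says nothing about the formal complexity results cited above.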

If the learner has the capability to choose examples, how should that choice be
influenced? Again, research in formal learning theory has tried to characterize “good” and
“bad” examples [Rivest and Sloan, 1988; Ling, 1991]. Learning algorithms are faster
with good examples, and learning speed degenerates when the quality of the examples
decreases.

Factorization  of  concepts  into  independent  relations  seems  to  be  a  powerful  technique 
for  generating  discrimination  experiments  efficiently  in  version  spaces  [Subramanian  and 
Feigenbaum,  1986].  [Gross,  1988]  shows  that  selecting  examples  to  reduce  the  difference 
between a concept description and its current maximum generalization is more effective
than  selecting  examples  at  random.  A  similar  experimentation  technique  is  used  in 
[Sammut and Banerji, 1986]. [Ruff and Dietterich, 1989] presents a study on the
effectiveness of experiments. The performance of several experimentation strategies was
tested  on  Boolean  function  learning.  The  results  show  that  the  ability  to  do  any  kind  of 
experimentation  dramatically  increases  performance.  Simple  but  clever  experimentation 
strategies  were  found  to  be  almost  as  effective  as  sophisticated  and  expensive  ones.  One 
could argue that the optimal experimentation strategy is one that would generate
examples close to the ones that a good teacher would [VanLehn, 1987; Salzberg et al., 1991],
and far from the ones that a non-cooperative teacher would generate [Dent and
Schlimmer, 1990]. However, it is not possible to generate the optimal sequence of examples
(experiments)  unless  the  concept  is  known  beforehand  and  the  appropriate  near-misses 
can  be  generated  [Winston,  1975]. 

What  do  these  results  in  experimental  and  formal  concept  learning  tell  us?  First, 
that  it  is  important  that  the  learner  be  active  in  the  learning  process.  This  is  why  active, 
directed  experimentation  is  a  very  promising  approach  for  learning.  Second,  that  the 
nature  of  the  examples  greatly  influences  the  speed  of  the  learning:  good  examples  make 
learning  faster.  In  other  words,  good  experiments  make  learning  faster. 


2.1.2  Experimentation in Scientific Discovery Systems

Experimentation  is  a  vital  component  of  science.  Most  scientific  discovery  programs 
use  the  results  of  experiments  to  formulate  quantitative  or  qualitative  laws,  such  as 
[Langley et al., 1987; Falkenhainer and Michalski, 1986; Nordhausen and Langley, 1992].
It is the user who designs the experiments, executes them, and provides the system
with the results. More recent programs show an increasing interest in modeling scientific
experimentation.

COAST 

Explanation-based theory revision [Rajamoney, 1988] is a method that uses
experimentation to augment and correct theories. It is demonstrated in COAST, a system that
revises  qualitative  theories  of  physical  world  processes,  like  evaporation  and  osmosis. 
COAST detects a fault in the theory when (1) an observation cannot be explained, (2)
the predictions contradict the observations, or (3) multiple explanations can be built
for a given observation. Then, it uses a set of theory revision operators together with
constraints (like the type of failure, the situation in which it happened, etc.) to produce
a  set  of  revised  theories.  These  theories  can  be  tested  together  by  building  abstract 
hypotheses  that  cover  a  number  of  them.  The  abstract  hypotheses  are  used  to  build  an 
explanation  for  the  failure.  The  hypotheses  are  then  tested  through  experimentation  or 
through  previous  observations.  From  all  the  revised  theories  that  pass  the  test,  one  is 
selected  based  on  simplicity  and  predictive  power. 

Let  us  take  a  closer  look  at  what  is  called  in  COAST  experimentation-based  hypothesis 
refutation.  First,  the  hypothesis  is  used  to  create  a  prediction  that  specifies  the  values  of 
variables  that  agree  with  the  theory.  Then  experiments  are  designed  that  determine  the 
experimental values of those variables. COAST implements three strategies for designing
experiments. Elaboration selects a variable to be measured according to the ease of the
measurement. Discrimination prefers variables whose predicted values are different for
different  hypotheses.  Finally,  the  transformation  strategy  produces  totally  new  setups 
for  doing  experiments  when  the  possibilities  of  the  current  one  have  been  thoroughly 
exhausted.  A  more  detailed  study  of  discrimination  and  transformation  experiments  is 
presented  in  [Rajamoney,  1992].  [Falkenheiner  and  Rajamoney,  1988]  presents  a  method 
for  combining  experimentation-based  theory  revision  with  analogical  reasoning. 
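
The discrimination strategy can be illustrated with a minimal sketch. The representation (a dictionary from hypothesis names to predicted variable values), the function name, and the example values are assumptions made for illustration only; they are not taken from COAST.

```python
def most_discriminating_variable(predictions):
    """Pick the variable whose predicted values differ most across hypotheses.

    predictions: dict mapping hypothesis name -> dict of variable -> predicted value.
    (Hypothetical representation, assumed for this sketch.)
    """
    variables = set()
    for predicted in predictions.values():
        variables.update(predicted)

    def distinct_values(variable):
        # Count how many different values the hypotheses predict for this variable.
        return len({predicted.get(variable) for predicted in predictions.values()})

    # Sort the names first so that ties are broken deterministically.
    return max(sorted(variables), key=distinct_values)


# Three hypothetical revised theories about an evaporation setup.
predictions = {
    "h1": {"level": "decreasing", "temperature": "constant"},
    "h2": {"level": "constant", "temperature": "constant"},
    "h3": {"level": "increasing", "temperature": "constant"},
}
best = most_discriminating_variable(predictions)
```

In this example all hypotheses agree on the temperature, so measuring it could not refute any of them; the level is the variable worth observing.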

In brief, COAST uses experiments to test revisions of theories about physical processes,
and relies heavily on the ability of those theories to produce explanations. The
design of experiments involves choosing which variables to observe and which values they
take under the theory being tested. In contrast, EXPO does not try to learn about
how processes evolve in the physical world. Rather, the domain knowledge models the
conditions and effects of actions over which the planner has control. EXPO does not
have access to a theory that produces explanations for failures as COAST does, since its
only  available  knowledge  is  the  domain  operators  for  planning  and  they  are  incapable 
of  producing  such  explanations.  EXPO’s  hypotheses  are  produced  without  looking  at 
the  semantics  of  a  failure  encountered.  While  explanations  are  powerful,  we  wanted 
to  investigate  the  potential  of  a  theoryless  system,  which  turns  out  to  have  impressive 
performance. 

KEKADA 

KEKADA  [Kulkarni,  1988]  implements  a  set  of  experimentation  strategies  that  model 
scientists at work. It simulates the discovery of the ornithine cycle based on Hans Krebs’s
accounts.

KEKADA’s  experimentation  strategies  are  implemented  as  heuristic  operators,  which 
are  grouped  into  categories  as  follows.  Problem  choosers  decide  which  problem  to  focus 
on. Hypothesis generators create hypotheses about the problem at hand. Then, the
hypothesis or strategy proposers decide which hypothesis to concentrate on or which strategy
to use to work on the problem. The experiment proposers design experiments based on
the hypotheses. Then, expectation setters find out from the hypotheses what the results
of the experiments are expected to be. The experimenters carry out experiments. Next,
the results of the experiments are analyzed by the hypothesis and confidence modifiers,
which modify the hypotheses and the confidences about them. Finally, if the expectations
for the experiments do not agree with the observations, the problem generators propose
to  study  this  phenomenon.  When  there  is  more  than  one  alternative  in  any  of  the  above 
decisions,  decision  maker  heuristics  are  used  to  make  a  choice. 

There are several strategies available to the strategy proposers: (1) magnify the phenomenon
by varying the values of variables in the experiments, (2) divide and conquer
to isolate subprocesses, (3) determine the scope of the phenomenon using an object type
hierarchy, (4) determine which factors are necessary for the phenomenon to occur, (5)
relate the phenomenon to another one, (6) gather more data about the phenomenon
systematically, and (7) apply domain-dependent specializations of general strategies like
controlled experimentation.

An experiment in KEKADA is specified by the following: the inputs, the conditions
and the place for carrying it out, the initial quantities of the inputs, and the observations to
be  collected  after  the  experiment  is  carried  out.  The  expectation  setters  form  expectations 
for  an  experiment  that  consist  of  the  expected  output  substances  and  the  lower  and  upper 
bounds  on  the  quantities  and  rates  of  those  substances. 

Thus,  KEKADA’s  specifications  of  experiments  are  domain  specific.  KEKADA  is 
given domain-dependent knowledge about substances, chemical reactions, and other people’s
experiments on urea synthesis that Krebs was aware of. About half of the heuristics
in KEKADA are domain dependent, although they can be used for other biochemistry
applications. Most of KEKADA’s domain-independent heuristics are used by EXPO, as
discussed  in  Section  4.6. 

STERN 

STERN  [Cheng,  1990]  is  a  scientific  discovery  system  that  models  experiments  using 
Galileo’s work on free fall. In STERN, hypotheses are expressed as equations. Experiments
are used to (dis)confirm hypotheses and to generate new hypotheses.

Experiments  are  designed  at  three  levels  of  abstraction.  At  the  most  abstract  level,  an 
experimental paradigm is chosen, such as pendulums or inclined planes. At the next level,
an  experimental  setup  is  chosen,  which  is  a  particular  instantiation  of  the  experimental 
paradigm.  At  this  point,  a  particular  inclined  plane  with  concrete  values  for  physical 
dimensions  such  as  length,  inclination  angle,  and  height  would  be  chosen.  At  the  last, 
most  concrete  level,  an  experimental  test  is  chosen.  For  example,  we  may  choose  to  look 
at  how  the  distance  down  an  inclined  plane  varies  with  time. 

The  parameters  involved  in  the  experiment  are  classified  as  follows.  One  is  chosen 
to  be  the  output,  another  one  manipulable,  and  the  rest  are  considered  constant.  The 
constants  are  always  set  to  the  midpoint  of  their  range,  and  the  manipulable  variable 
is  given  values  within  its  range.  The  purpose  of  the  experiments  is  to  find  out  how  the 
output  variable  is  related  to  the  manipulable  variable  with  the  other  values  held  constant. 
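
As a rough sketch of this classification, the function below builds the settings for one experimental test: every input other than the manipulable one is fixed at the midpoint of its range. The evenly spaced sweep, the function name, and the example ranges are assumptions of this sketch; the text above says only that the manipulable variable is given values within its range.

```python
def design_settings(ranges, manipulable, steps=5):
    """Return a list of parameter settings for one experimental test.

    ranges: dict mapping input parameter name -> (low, high).
    manipulable: the parameter that is varied; all other inputs are held
    at the midpoint of their range. The output parameter is assumed to be
    left out of `ranges` because it is measured, not set.
    steps: number of values to try (assumed to be at least 2).
    """
    constants = {
        name: (low + high) / 2.0
        for name, (low, high) in ranges.items()
        if name != manipulable
    }
    low, high = ranges[manipulable]
    settings = []
    for i in range(steps):
        # Evenly spaced values across the range (an assumption of this sketch).
        value = low + (high - low) * i / (steps - 1)
        settings.append({**constants, manipulable: value})
    return settings


# Hypothetical inclined-plane test: vary distance, hold the angle constant,
# and measure time as the output.
settings = design_settings(
    {"distance": (0.0, 2.0), "angle": (10.0, 30.0)}, "distance", steps=3
)
```

Each resulting setting would be executed once, and the measured output paired with the value of the manipulable variable.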

STERN  uses  two  types  of  knowledge  during  experiment  design.  Pragmatic  knowledge 
prefers  paradigms  with  experimental  setups  that  are  easier  to  manufacture.  For  example, 
distance is easier to manipulate than time. Background knowledge eliminates experimental
setups that are trivial. For example, given the angle of inclination of a plane and its
length,  the  height  can  be  geometrically  deduced  without  need  for  experiments. 

STERN has the ability to design new experimental paradigms by combining existing
ones,  such  as  an  inclined  plane  and  a  projectile.  This  is  necessary  when  it  is  not  tractable 
to  design  experiments  in  an  existing  paradigm  (for  example,  if  a  variable  cannot  be 
eliminated  from  an  equation). 

Because  the  experimentation  space  is  quite  large,  STERN  has  some  heuristics  to 
improve  its  performance.  The  practicality  of  each  paradigm,  based  on  the  number  of 
setups  and  the  ease  of  the  manufacture  of  the  setups,  is  used  to  activate  paradigms. 

EXPO’s experiments are very different in nature from those of a scientific discovery
system  like  STERN.  STERN’s  hypotheses  are  equations,  i.e.,  mathematical  relations 
between  variables.  EXPO’s  hypotheses  in  our  hair  dryer  example  in  Section  1.1  are  a  set 
of  candidate  conditions,  i.e.,  predicates  (possibly  with  several  variables)  that  must  be  true 
for the action to work. STERN chooses an output variable and a manipulable variable
in the equation, and keeps the rest constant. Then it gives values to the manipulable variable
and the constants, and observes the value of the output variable after the execution of the
experiment. EXPO, on the other hand, does not need to classify the variables present
in the candidate conditions. All the variables in the conditions, and many more, are
instantiated by the planner when it is invoked to achieve the state in which to perform
the experiment (this is explained in detail in Chapter 4). EXPO has many variables to
observe  after  the  experiment’s  execution,  which  correspond  to  all  the  known  effects  of 
the  action.  Also,  STERN  repeats  experiments  with  the  same  setup  and  different  values 
of  the  manipulable  variable.  EXPO,  on  the  other  hand,  designs  experiments  so  that  a 
hypothesis  is  disconfirmed  or  confirmed  after  each  one.  Both  EXPO  and  STERN  prefer 
experiments  that  are  easier  to  perform,  and  they  both  share  a  concern  for  the  efficiency 
of  the  experimentation  process. 

FAHRENHEIT 

FAHRENHEIT [Zytkow et al., 1990] extends BACON’s ability to discover quantitative
laws from numerical data. The system determines not just the regularities of the
set of variables, but also the range of values for which the functional relation holds.
FAHRENHEIT makes BACON more efficient through a multi-level search strategy that
changes the order in which variables are considered.

FAHRENHEIT’s experimentation ability greatly extends BACON. It automates the
experiments  and  data  collection  through  a  hardware  system  that  controls  some  equipment 
in  a  chemistry  lab.  The  experiments  are  designed  according  to  the  current  knowledge  of 
the  system. 

Unlike FAHRENHEIT’s, the parameters of EXPO’s experiments can be nonnumerical.
FAHRENHEIT’s techniques could be used by EXPO in numerical domains where the
operators  were  applicable  for  certain  values  of  their  parameters  (we  discuss  this  in  more 
detail  in  Section  7.3.1).  FAHRENHEIT  is  given  a  physical  configuration  where  the 
experiments  are  to  be  carried  out.  The  experiments  differ  in  the  values  that  are  given  to 
the  controllable  parameters.  In  contrast,  EXPO  has  to  design  the  configuration  state  in 
which  the  experiment  can  be  carried  out,  and  build  a  plan  to  achieve  such  a  state.  Every 
experiment  is  different  in  nature  from  the  rest,  and  the  selection  of  designs  that  satisfy 
the  user’s  requirements  is  of  crucial  importance  for  EXPO. 


2.2  Planning  and  Learning  from  the  Environment 

As  we  mentioned  in  the  introduction,  there  is  considerable  interest  in  planning  systems 
that acquire control knowledge by introspection [Korf, 1985; Sacerdoti, 1974; Mitchell et al.,
1983; Laird et al., 1986; Minton, 1988; Mostow and Bhatnagar, 1987; Veloso, 1992].
All  these  systems  differ  from  EXPO  in  two  major  ways.  First,  they  only  learn  control 
knowledge  while  EXPO  concentrates  on  acquiring  factual  domain  knowledge.  EXPO  is 
learning  at  the  knowledge  level,  as  opposed  to  the  symbol  level  [Newell,  1982;  Dietterich, 
1986].  Second,  they  learn  by  introspection,  and  not  from  interaction  with  an  external 
environment as EXPO does.

LEX 

LEX [Mitchell et al., 1983] is a system that has some experimentation capabilities
to learn control knowledge by introspection in the domain of symbolic integration. The
left-hand sides of its heuristics are represented as version spaces.

LEX  is  composed  of  four  modules.  The  problem  generator  proposes  a  new  problem 
to  solve.  The  problem  solver  searches  for  a  solution  to  the  proposed  problem  using  the 
currently  available  heuristics.  Next,  the  critic  examines  the  solution  trace  and  assigns 
credit  to  search  steps  leading  towards  or  away  from  a  solution.  Each  step  may  be  classified 
as  a  positive  or  negative  instance  of  one  of  the  heuristics.  Then,  the  fourth  module,  the 
generalizer,  comes  into  play.  It  updates  the  version  space  that  corresponds  to  the  heuristic 
of  each  positive  and  negative  instance.  Then,  the  problem  generator  looks  at  the  new 
definitions  of  the  heuristics  and  proposes  new  problems  to  experiment  with.  LEX  then 
enters  a  new  generate-solve-critic-generalize  cycle. 

The  problem  generator  is  the  module  responsible  for  generating  experiments.  It 
prefers problems that can be solved with the current operators and heuristics, and problems
whose solutions will provide informative instances. One way for a problem to be
informative  is  to  produce  instances  of  existing  partially  learned  heuristics.  Problems  of 
this  kind  are  generated  by  choosing  a  partially  learned  heuristic,  and  creating  a  problem 
that  matches  some  but  not  all  the  members  of  the  version  space  of  that  heuristic.  LEX 
does  so  by  using  a  hierarchy  of  the  types  of  mathematical  functions  that  it  can  use  in 
the problems. Another way in which a problem can be informative is that it may lead
to the creation of a new heuristic. Problems of this type are those in which two operators are
applicable  but  there  is  no  current  heuristic  to  recommend  which  operator  to  prefer. 
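
The first kind of informative problem can be sketched as follows. For simplicity the version space is written out as an explicit list of candidate descriptions, each a predicate over the type of function in the integrand; this enumeration, the function types, and all names are assumptions of the sketch and do not reproduce LEX's actual representation.

```python
def is_informative(problem, version_space):
    """A problem is informative if it matches some, but not all,
    of the candidate descriptions in the version space."""
    matches = [candidate(problem) for candidate in version_space]
    return any(matches) and not all(matches)


# Hypothetical candidates for the left-hand side of one partially learned
# heuristic, ordered from most specific to most general.
version_space = [
    lambda f: f in {"sin", "cos"},
    lambda f: f in {"sin", "cos", "tan"},
    lambda f: f in {"sin", "cos", "tan", "exp", "ln"},
]

candidate_problems = ["sin", "tan", "exp", "poly"]
informative = [p for p in candidate_problems if is_informative(p, version_space)]
```

A problem that every candidate accepts, or that none accepts, cannot eliminate any member of the version space, which is why only the partially matching problems are worth generating.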

LEX  uses  experimentation  to  acquire  the  left-hand  side  of  control  rules,  while  EXPO’s 
intent is operator refinement. Additionally, LEX instantiates functions to create problems
through a hierarchy of function types. EXPO, on the other hand, has to design goal states with
several  predicates,  and  is  concerned  with  the  actual  planning  for  achieving  such  goal 
states  and  the  interaction  of  this  planning  with  the  main  problem  at  hand. 

CHEF 

CHEF [Hammond, 1986] is a system that, like EXPO, learns from plan execution
failures. CHEF is a case-based planner for the domain of Szechuan cooking.

CHEF  has  a  memory  of  plans  that  are  recipes,  and  it  uses  them  to  create  new  ones. 
For  example,  suppose  we  want  a  recipe  for  beef  with  broccoli.  CHEF  retrieves  a  plan 
from its memory, say beef and green beans, and adapts it to meet the goals of the current
problem. In this case, it would add a step to chop the broccoli. After coming up with
a  plan,  CHEF  simulates  its  execution  in  the  real  world.  If  the  result  of  the  simulation 
does  not  satisfy  the  goals  of  the  problem,  an  expectation  failure  has  been  found.  In  this 
example,  the  simulator  indicates  that  the  broccoli  is  soggy,  and  not  crisp  as  wanted.  The 
simulator  also  returns  an  explanation  of  the  failure:  that  the  beef  leaves  water  in  the 
pan, and that water makes broccoli soggy. (This did not happen in the original recipe,
because green beans are sturdier.) CHEF uses this explanation to repair the plan,
adding  an  extra  step  to  cook  the  broccoli  first  and  then  the  beef.  The  new  plan  is  stored 
in  memory,  indexed  by  the  causes  of  the  failure  contained  in  the  explanation. 

The  repair  used  in  a  plan  may  be  transferred  to  a  new  problem  that  may  have  the 
same  failure.  For  example,  if  CHEF  is  asked  for  a  recipe  for  chicken  and  snow  peas,  it 
remembers  the  broccoli  episode  and  anticipates  a  potential  problem  of  plans  that  cook 
the  chicken  and  the  snow  peas  at  the  same  time.  It  then  uses  the  beef  and  broccoli  recipe 
to  create  a  plan  that  avoids  the  same  failure. 

So  CHEF,  like  EXPO,  learns  to  avoid  plan  failures.  But  one  important  difference  is 
how  learning  is  done.  CHEF  calls  a  simulator  of  the  real  world  with  a  plan,  and  gets 
back  a  description  of  the  failures  of  the  plan  together  with  a  causal  explanation  for  the 
failures.  EXPO  uses  a  simulator  of  the  world  as  well,  but  it  monitors  the  simulation  step 
by  step  and  detects  local  failures  instead  of  being  informed  of  them.  EXPO  determines 
the  causes  of  failures  by  designing  and  executing  experiments.  It  is  not  told  about  the 
causal  chain  that  provokes  a  failure. 

CHEF  repairs  plans  that  cause  failures,  and  reuses  them  to  avoid  the  same  failure  in 
future  plans.  EXPO  learns  to  repair  operators  that  cause  failures,  and  uses  the  corrected 
operator to build plans that will not incur the same failure. Thus, the granularity
is  different.  This  has  to  do  with  the  fact  that  CHEF  is  a  case-based  planner,  while 
EXPO  is  designed  to  learn  domain  knowledge  for  generative  planners.  CHEF  learns  to 
avoid  cooking  some  vegetables  with  meats  that  sweat  water.  EXPO  would  learn  to  avoid 
cooking some vegetables in the presence of water (whatever its origin), thus covering a
larger  range  of  possible  failure  situations. 

LIVE 

LIVE  [Shen,  1989]  is  a  system  that  learns  from  its  environment.  LIVE  is  designed  for 
exploration and discovery. It can formulate new operators by executing actions whose
conditions and effects are unknown. It can also formulate new terms if the language is
insufficient.  EXPO  does  not  do  any  of  this. 

The most relevant part of LIVE is its method for refining operators by splitting existing
ones. When the expected effects of an operator are not obtained upon its execution
(i.e., a surprise is obtained), the operator’s conditions are specialized to exclude the current
type of situation. In addition, a new sibling operator is created with the existing
operator’s conditions and the effects actually obtained. (This method is similar in spirit
to learning by discrimination [Langley, 1987].) EXPO, on the other hand, opts for learning
only the specialized operator when it encounters an execution failure. The sibling
operator  is  in  practice  accounting  for  a  set  of  unexpected  unwanted  effects  which  does 
not  agree  with  a  task-directed  approach  like  EXPO’s. 
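
The splitting step, as described above, can be sketched in a few lines. The operator representation, the literal strings, and the way the distinguishing condition is supplied are all assumptions of this sketch; it does not reproduce LIVE's implementation, and finding the distinguishing condition is left outside it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operator:
    name: str
    conditions: frozenset
    effects: frozenset


def split_on_surprise(op, distinguishing_condition, observed_effects):
    """Return (specialized, sibling) after a surprise.

    distinguishing_condition: a literal assumed to hold in earlier successful
    applications but not in the surprising one (hypothetical input).
    """
    # Specialize the conditions so the surprising situation is excluded.
    specialized = Operator(
        op.name,
        op.conditions | {distinguishing_condition},
        op.effects,
    )
    # The sibling keeps the existing conditions and records what was observed.
    sibling = Operator(
        op.name + "-sibling",
        op.conditions,
        frozenset(observed_effects),
    )
    return specialized, sibling


# Hypothetical example: pushing an object was expected to move it,
# but nothing happened in a situation where the object was not light.
push = Operator("push", frozenset({"clear(x)"}), frozenset({"moved(x)"}))
specialized, sibling = split_on_surprise(push, "light(x)", set())
```

In these terms, LIVE would keep both operators, whereas EXPO would keep only the specialized one.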

LIVE  uses  experimentation  to  revise  learned  rules  that  prove  to  be  too  specific  during 
planning.  The  experiment  consists  of  an  instantiation  of  the  rule’s  sibling  rule  that 
involves  applying  the  action  to  a  different  object  in  a  situation  that  has  not  been  seen 
before  (and  so  is  likely  to  produce  a  surprise).  LIVE  has  a  preference  for  experiments 
that  can  be  immediately  executed  in  the  current  state  or  in  easily  reachable  states. 
EXPO  designs  experiments  with  varied  conditions.  It  has  a  more  flexible  mechanism 
for  experiment  preferences,  one  that  takes  into  account  much  more  than  the  ease  of 
execution.  EXPO’s  domains  are  more  complex  in  size  than  LIVE’s  domains. 

CAP 

CAP [Hume and Sammut, 1991] is a system that uses experiments to build a theory
that  can  be  used  to  recognize  sequences  of  actions  performed  by  other  agents.  When  CAP 
observes  such  a  sequence,  it  divides  it  into  meaningful  subsequences  that  are  generalized 
using  inverse  resolution.  The  generalizations  are  tested  with  experiments. 

The  variables  of  an  experiment  are  instantiated  through  inverse  resolution,  which 
also  produces  changes  in  the  state  of  the  external  world  if  needed  for  the  experiment.  If 
the  experiment  is  successful,  then  the  action  description  is  generalized.  When  an  action 
cannot  be  used  because  a  condition  P  is  too  specific,  a  new  term  ~  P  is  created  and  a 
new  action  is  postulated  with  ~  P  as  a  condition. 

CAP,  like  EXPO,  does  some  pre-planning  for  experiment  setup.  However,  work  on 
CAP to date has not addressed the choice of experiments or the selection of pre-experiment
plans, both of which are major issues in the design of EXPO. CAP detects faults
in the domain theory when an action cannot be used to produce a proof. EXPO, on
the other hand, detects faults in the domain theory when the execution of an action
that was believed to be a legal step of the plan fails. CAP refines precondition expressions
by  generalizing  overly  specific  preconditions.  EXPO,  on  the  other  hand,  learns  new 
preconditions and also new effects of operators.

Soar

The Soar architecture acquires control knowledge from human advice [Laird et al.,
1989; Laird and Rosenbloom, 1990] in a robotics environment. When no control knowledge
is available to select an operator, Soar either makes a random choice or prompts
for advice. When existing control knowledge is incorrect, Soar is forced to reconsider
each decision and incorporate human advice. This advice consists of both recommendations
and disrecommendations of operators. In contrast, EXPO concentrates on the
acquisition  of  domain  knowledge  and  never  interacts  with  a  human  during  learning. 

Other  Work  on  Planning  and  Learning  from  the  Environment 

[Kedar et al., 1991] presents a system that refines operators by building causal
explanations of their failures. The explanations are built using a set of domain constraints
on the state descriptions. If the reason for the failure is a contradiction between the
expectation and the domain constraints, then the difference between the expected and observed
states is explained. The result of an explanation is a new precondition for the operator.
If it is not possible to build an explanation, then a new operator with a variant
outcome (the observed effects) is created. This is in the same spirit as LIVE and
discrimination learning. If several explanations can be constructed, then there are several
candidates for new preconditions. This may cause complications for [Kedar et al., 1991].
EXPO’s experimentation techniques could then be a good way to discriminate amongst
the  candidates. 

Other systems have experimentation capabilities to learn from real robotic environments.
[Christiansen, 1992] describes empirical learning of manipulations. Almost no
prior knowledge is assumed. The system designs an experiment by giving values
to the task parameters, performs it, and clusters
the parameter space according to the resulting state. The system demonstrates two
experimentation techniques: random training and strategic self-training. Random training
involves  a  random  choice  of  values  for  the  experiment  parameters.  Strategic  self-training 
explores  the  parameter  space  randomly  until  the  execution  of  the  action  does  not  unfold 
as  predicted.  Then,  a  similar  action  is  chosen  by  giving  the  parameter  a  new  value  chosen 
randomly  from  a  constant  interval  around  its  current  value.  Extensive  empirical  tests  in 
various  manipulation  tasks  show  that  strategic  self-training  yields  better  theories  than 
random  training.  Another  such  system  is  presented  in  [Gross,  1991].  Its  experimentation 
design  is  more  sophisticated,  in  that  it  is  able  to  vary  several  parameters  at  a  time.  The 
parameter  space  is  divided  into  regions.  Two  types  of  experiments,  generalization  and 
specialization,  reduce  uncertainty  surrounding  a  region  or  within  a  region  respectively. 
Each type of experiment is designed using a set of heuristics that decide the value of the
parameter. The system dynamically defines new attributes, a very desirable capability
when learning from the environment. Both of these systems assume the parameters of
the experiments to be numeric, discrete, and ordered. The experiments do not require
any planning steps for setup; the action can be immediately executed. EXPO’s required
experimentation  capabilities  do  not  assume  such  restrictions  in  the  parameter  values,  and 
produce  more  elaborate  setups.  However,  these  systems  are  able  to  deal  with  noise  in 
the  observations,  while  EXPO  is  not. 
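
The difference between random training and strategic self-training, as summarized above, comes down to how the next parameter value is chosen. The sketch below is a minimal illustration under assumed names and an assumed integer parameter; it is not taken from [Christiansen, 1992].

```python
import random


def next_value(current, surprised, low, high, radius, rng):
    """Choose the next integer parameter value to try.

    If the last trial did not unfold as predicted (surprised is True),
    sample from a fixed-width interval around the current value, clipped
    to the legal range. Otherwise sample from the whole range, which is
    also what random training does on every trial.
    """
    if surprised:
        return rng.randint(max(low, current - radius), min(high, current + radius))
    return rng.randint(low, high)


rng = random.Random(0)  # seeded only to make the sketch repeatable
near = next_value(50, True, 0, 100, 5, rng)        # after a surprise
anywhere = next_value(50, False, 0, 100, 5, rng)   # no surprise
```

Sampling near a surprising value concentrates trials where the current model is wrong, which is one plausible reading of why the strategic variant yields better theories.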

Another  project  on  robots  that  learn  is  the  subsumption  architecture  [Brooks,  1986; 
Maes  and  Brooks,  1990;  Maes,  1991].  Actions  are  modeled  as  behaviors,  whose  conditions 
are  conjunctions  of  binary  perceptual  features.  The  robot  receives  binary  feedback  which 
it  uses  to  learn  when  to  activate  behaviors.  Each  behavior  monitors  the  values  of  the 
percepts  and  detects  their  correlation  with  the  feedback  received.  If  there  is  a  strong 
correlation  with  a  percept,  it  is  added  as  a  new  precondition  of  the  behavior.  Arbitration 
between  behaviors  is  also  achieved  by  tuning  a  network  to  the  current  goals.  There  is 
no  directed  experimentation  in  this  framework,  and  learning  takes  the  form  of  adaptive 
control.  Other  systems  that  learn  to  control  their  actions  with  this  type  of  trial-and-error 
learning  from  experience  are  reinforcement  learning  systems  [Sutton,  1990;  Kaelbling, 
1990;  Watkins,  1989;  Mahadevan  and  Connell,  1992].  These  systems  use  subsymbolic 
models  of  the  world.  In  contrast,  EXPO  has  explicit  descriptions  of  the  conditions  and 
the  expected  effects  of  actions. 

Many planners use plan repair techniques to avoid plan failures [Sussman, 1975; Sacerdoti,
1977; Wilensky, 1983; Wilkins, 1988]. Their planning algorithms use plan modification
strategies to solve interactions between plan steps during planning. But they
assume  that  the  domain  knowledge  is  complete  and  correct.  EXPO,  on  the  other  hand, 
does  not  make  this  assumption.  It  is  given  a  plan  that  is  believed  to  work  based  on  the 
planner’s  expectations.  EXPO  can  be  surprised  if  it  finds  that  the  plan’s  execution  fails, 
because  of  wrong  expectations.  EXPO  concentrates  on  repairing  the  domain  knowledge 
(not  the  plan)  through  experimentation.  Armed  with  this  new  knowledge,  the  planner 
will  not  have  the  same  wrong  expectations  in  the  future. 


2.3  Theory  Refinement  and  Knowledge  Acquisition 


Theory  Refinement 

In explanation-based learning (EBL) [Mitchell et al., 1986; DeJong and Mooney,
1986],  a  theory  composed  of  rules  is  used  to  build  an  explanation  that  justifies  why  an 
example  is  an  instance  of  the  concept  described  by  the  theory.  When  the  rules  contain 
errors,  no  explanation  may  be  constructed  for  some  examples  of  the  concept  and  (yet 
worse) an explanation may be built for instances that are not examples of the concept.
The refinement of theories for EBL has been a major focus of research, addressing
different  types  of  errors:  incompleteness  [Danyluk,  1991;  Sleeman  et  al.,  1990;  VanLehn, 
1987;  Genest  et  al.,  1990;  Mahadevan,  1989;  Kodratoff  and  Tecuci,  1991],  incorrectness 
[Ourston  and  Mooney,  1990;  Bylander  and  Weintraub,  1988],  intractability  [Tadepalli, 
1989; Ellman, 1989; Chien, 1990; Flann, 1990], or combinations of these types of errors
[Pazzani, 1988; Hall, 1988]. A theory is incomplete when only partial explanations
can be built due to lack of information in the theory. The above-mentioned systems
refine the theory by building partial explanations and completing them using various
techniques including inductive methods [Pazzani, 1988; Danyluk, 1991], analogical reasoning
[Falkenheiner, 1989; Genest et al., 1990], apprentice-type techniques [Kodratoff
and Tecuci, 1991; VanLehn, 1987], and experimentation (see the COAST system in
Section 2.1.2).

Although EXPO is also designed to refine incomplete knowledge, it acquires both
conditions  and  effects  of  actions,  which  is  quite  a  different  type  of  rule  than  EBL  rules. 
The  failures  obtained  from  executing  actions  are  very  different  from  explanation  failures. 
There  is  no  reason  to  believe  that  the  same  learning  paradigms  cannot  be  applied  to 
refine  incomplete  domain  knowledge,  although  this  is  an  open  issue. 

Knowledge  Acquisition 

Many  tools  have  been  designed  to  aid  in  the  engineering  of  knowledge  bases  (see 
[Marcus,  1990;  Boose,  1992]  for  good  overviews).  The  acquisition  of  knowledge  is  done 
through  interaction  with  a  human  expert.  EXPO,  on  the  other  hand,  is  given  an  initial 
knowledge base that is produced by the expert, and is able to acquire knowledge autonomously
in domains that allow direct interaction with the system being modeled in
the  knowledge  base. 


2.4  Other  Related  Work 

There  is  work  in  the  field  of  fault  diagnosis  on  violated  expectations  [Davis  et  al.,  1982; 
Genesereth,  1984].  Any  disagreements  between  the  expected  behavior  of  a  device  and 
its  actual  behavior  indicate  malfunctions  that  must  be  repaired.  In  this  literature,  the 
term  “failure”  is  used  in  a  different  sense  than  in  the  planning  literature:  faults  are 
misbehaviors,  and  failures  are  the  causes  of  faults.  Many  failures  of  a  device  may  be 
possible  causes  of  a  fault,  much  in  the  way  EXPO  must  consider  many  possible  domain 
adjustments  for  a  given  execution  failure.  However,  fault  diagnosis  systems  find  the 
causes  of  a  fault  by  building  a  causal  explanation  of  the  fault,  using  a  detailed  theory 
of  the  functionality  of  the  device  (often  a  qualitative  model)  [Davis,  1984;  Genesereth, 
1984;  Patil  et  al.,  1981;  Pazzani,  1990].  Such  models  are  clearly  powerful,  but  extremely 
difficult to craft. One positive and unique aspect of EXPO is that it is able to find the
cause  of  a  failure  without  relying  on  such  models. 


Chapter 3

The  Role  of  Experimentation  in 
Planning 


As we saw in the previous chapter, experimentation techniques have been used for learning
in various contexts. This thesis applies experimentation to learning from the environment
in order for a planner to acquire the new knowledge necessary to accomplish
each new task at hand. The purpose of this chapter is to explain how our work on
experimentation fits into the context of planning.

We begin by describing our planning paradigm, and the type of domain knowledge
that it uses. Then, Section 3.2 describes operators as concepts. Because concept learning
is a well understood framework with many years of research behind it, it provides a useful
perspective on the automatic refinement of operators. One important point is that a
planner is given an initial body of knowledge, and these concepts are not initially empty.
However, the initial definitions may contain various types of imperfections that need to
be understood and addressed on an individual basis. Section 3.3 presents four types of
imperfections that may occur in the knowledge base. Interaction with the environment
to acquire new knowledge presents many issues still under research. Section 3.4 presents
our assumptions and states the limitations of our approach in this respect. Then, Section
3.5 describes precisely our definition of experimentation. The experimentation process
must be directed and efficient, and this section explains why this is important within a
planning context. Finally, Section 3.6 presents PRODIGY [Minton et al., 1989a; Minton
et al., 1989b; Carbonell et al., 1991], the particular system used for the implementation.


3.1  Domain Knowledge for Planning

Through many years of research in this area, different paradigms for planning have
emerged, including the problem space framework [Newell and Simon, 1972], case-based
planning systems [Kolodner, 1980; Hammond, 1986; Veloso, 1992], and plan refinement
[Schoppers, 1989]. This work concentrates on the problem space framework. The planner
is given a set of rules (called operators), each of which defines the legal transitions
between states. Plans are found by searching through the space of possible states. Many
planners have used this model, including STRIPS [Fikes and Nilsson, 1971], NOAH
[Sacerdoti, 1977], and SIPE [Wilkins, 1988]. In essence, the planner is given a set of operators
that model the possible actions. Each operator contains the conditions under which the
action can be executed, and the effects of the action. The planner is also given an initial
state, which is a model of the state of the external environment. Operators specify the
legal transitions from one state to another. The search for a plan consists of trying
different sequences of operators to reach a state that satisfies a given goal statement. The
operators together with the legal states constitute the domain knowledge of the planner.

Consider our robot planning domain. An operator for opening a door is:

    (OPEN
      (params (<door>))
      (preconds
        (and
          (is-door <door>)
          (unlocked <door>)
          (next-to robot <door>)
          (dr-closed <door>)
        ))
      (effects (
        (del (dr-closed <door>))
        (add (dr-open <door>))
      )))

The variable <door> is a parameter that can be instantiated to open particular doors.
The preconditions that have to be satisfied in order to open a door are that the robot
is next to the door, and that the door is closed and unlocked. The effects of the operator
are expressed in two lists. The delete list (del) specifies the facts that are no longer
true after the operator is applied. The add list (add) is composed of the facts that the
application of the operator makes true. In our example, after opening a door, the door
is no longer closed and it is open. To open door Door12 we use OPEN with the variable
<door> instantiated to Door12. Door12 is a binding for the parameter <door>. When all
the preconditions of an operator are satisfied in a state, then the operator is said to be
applicable. An operator is applied by changing the state according to its list of effects. If
the current state Sa contains the following facts:

    (is-a BoxA BOX)
    (is-door Door12)
    (in-room ROBOT Room1)
    (in-room BoxA Room2)
    (arm-empty)
    (connects Door12 Room1 Room2)
    (dr-closed Door12)
    (unlocked Door12)
    (next-to ROBOT Door12)

then we can apply the operator [OPEN Door12] and obtain the following state Sb:

    (is-a BoxA BOX)
    (is-door Door12)
    (in-room ROBOT Room1)
    (in-room BoxA Room2)
    (arm-empty)
    (connects Door12 Room1 Room2)
    (dr-open Door12)
    (unlocked Door12)
    (next-to ROBOT Door12)

Notice that the operator is applicable in any state in which the robot is next to a
door that is closed and unlocked. The preconditions of an operator represent the class
of states in which the operator is applicable. In contrast, the effects do not express the
class of states that result from the application of the operator. What they represent is
the transformation itself, i.e., the additions and deletions that must be done on the state
where the operator is applied. This asymmetry in the representation of the operators
must be taken into account when learning domain knowledge. We explain why in the
next section.
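The mechanics of matching and applying an operator can be illustrated with a minimal Python sketch. This is not the PRODIGY implementation: the names `Operator`, `ground`, `applicable`, and `apply_op` are hypothetical, facts are represented as tuples of strings, and the constant ROBOT is written in upper case throughout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operator:
    """Illustrative operator: facts are tuples that may contain variables."""
    name: str
    preconds: tuple
    adds: tuple
    dels: tuple


def ground(fact, bindings):
    """Replace variables such as "<door>" by their bound constants."""
    return tuple(bindings.get(term, term) for term in fact)


def applicable(op, bindings, state):
    """True when every grounded precondition is a fact of the state."""
    return all(ground(p, bindings) in state for p in op.preconds)


def apply_op(op, bindings, state):
    """Return the successor state: remove the delete list, add the add list."""
    if not applicable(op, bindings, state):
        raise ValueError(f"{op.name} is not applicable")
    removed = {ground(f, bindings) for f in op.dels}
    added = {ground(f, bindings) for f in op.adds}
    return (set(state) - removed) | added


OPEN = Operator(
    name="OPEN",
    preconds=(("is-door", "<door>"), ("unlocked", "<door>"),
              ("next-to", "ROBOT", "<door>"), ("dr-closed", "<door>")),
    adds=(("dr-open", "<door>"),),
    dels=(("dr-closed", "<door>"),),
)

state_a = {
    ("is-a", "BoxA", "BOX"), ("is-door", "Door12"),
    ("in-room", "ROBOT", "Room1"), ("in-room", "BoxA", "Room2"),
    ("arm-empty",), ("connects", "Door12", "Room1", "Room2"),
    ("dr-closed", "Door12"), ("unlocked", "Door12"),
    ("next-to", "ROBOT", "Door12"),
}

# Applying [OPEN Door12] in Sa yields Sb.
state_b = apply_op(OPEN, {"<door>": "Door12"}, state_a)
```

Note that `applicable` only reads the state, whereas `apply_op` describes a change to it, which mirrors the asymmetry between preconditions and effects discussed above.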


3.2  Refinement  of  Operators  as  Concept  Learning 

As we pointed out in the previous section, the preconditions of an operator represent
the class of states in which the operator is applicable. In fact, the preconditions form a
concept that expresses the (hopefully minimal) generalization of all those states. Similarly,
the effects are a generalization of the transition between states that the operator
represents. This means that learning the correct expression of an operator is, in fact, a
matter of concept learning from examples [Michalski et al., 1983]. This section shows
where these examples come from and how they can be used to learn the definition of an
operator.

Building a knowledge base is a process that requires iteration to correct the errors that
remain after each new version of the system. When users define operators for a
planning system, it is not uncommon for them to forget to write a precondition
or a side-effect of the action. Suppose that a planner is given the following incomplete
operator:

    (OPEN’
      (params (<door>))
      (preconds
        (and
          (is-door <door>)
          ;the condition (unlocked <door>) is missing
          (next-to robot <door>)
          (dr-closed <door>)
        ))
      (effects (
        (del (dr-closed <door>))
        (add (dr-open <door>))
      )))

Notice the missing condition (unlocked <door>). Now suppose that the planner
is given the goal (dr-open Door12) in state Sa (shown in the previous section). The
operator OPEN’ can be applied to achieve the goal. And in fact, if the robot tries to open
the real door represented by Door12, the door will open. This is because the door happens
to be unlocked, so even if the planner is unaware of the missing condition, the execution
of the action is successful. A state in which the execution of the action is successful can
be considered a positive example of the concept expressed in the conditions of the
operator.

Consider now that the planner is given the goal (dr-open Door23) and the following
initial state Sc:

    (is-door Door23)
    (next-to ROBOT Door23)
    (unlocked Door23)
    (dr-closed Door23)

The operator OPEN’ can be applied to achieve the goal. If the robot tried to execute
this action it would also be successful, again because the unknown condition that the
door must be unlocked happens to be true in Sc. In fact, this state is another positive
example of the concept expressed in the conditions. We can generalize from states Sa
and Sc by replacing the constants Door12 and Door23 by the variable <door>, and say
that a door can be opened when the following facts are true in a state:

    (is-door <door>)
    (next-to ROBOT <door>)
    (unlocked <door>)
    (dr-closed <door>)

Now suppose that the goal is (dr-open Door34), and the state Sd is:

    (is-door Door34)
    (next-to ROBOT Door34)
    (locked Door34)
    (dr-closed Door34)

This time, the planner will again believe that it can use OPEN’ to achieve the goal,
since all the known conditions are true in Sd. However, if it tries to execute the action and
open the door, Door34 will remain closed. This is because this time the door does not
happen to be unlocked. Sd can be considered a negative example of the concept that
the preconditions of the operator represent.

In  summary,  when  the  planner  is  given  the  ability  to  execute  actions  in  the  external 
world  and  observe  their  effects,  it  can  detect  faults  in  the  operators  that  model  these 
actions.  Each  successful  execution  of  the  action  corresponds  to  a  positive  example  of  the 
concept  that  the  precondition  expression  should  represent.  Similarly,  each  failure  is  a 
negative example of that concept. So in fact, the problem of learning the precondition
expression of an operator can be cast in terms of concept learning as follows:

Given:
    a set of positive examples
        (i.e., a set of states in which the action was successfully executed)
    a set of negative examples
        (i.e., a set of states in which the execution of the action failed)

Find:
    a description that covers all the positive examples and that
    does not cover any of the negative examples
        (i.e., the generalization of the states in which the action
        can be successfully executed)
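One simple way to instantiate this formulation is to take the most specific conjunction shared by all positive examples and then check it against the negative ones. The Python sketch below does exactly that; it is only an illustration of the concept learning view and not the method used by EXPO. The function names `variablize` and `learn_preconditions` are hypothetical, and the states are written with the predicate dr-closed used by the operator.

```python
def variablize(state, constant, var="<door>"):
    """Replace one constant by a variable in every fact of a state."""
    return frozenset(
        tuple(var if term == constant else term for term in fact)
        for fact in state
    )


def learn_preconditions(positives, negatives):
    """Most specific conjunction common to all positive examples.

    Each example is a (state, door_constant) pair. A ValueError is raised
    if the resulting description also covers a negative example.
    """
    generalized = [variablize(state, const) for state, const in positives]
    description = frozenset.intersection(*generalized)
    for state, const in negatives:
        if description <= variablize(state, const):
            raise ValueError("description covers a negative example")
    return description


state_a = {
    ("is-a", "BoxA", "BOX"), ("is-door", "Door12"),
    ("in-room", "ROBOT", "Room1"), ("in-room", "BoxA", "Room2"),
    ("arm-empty",), ("connects", "Door12", "Room1", "Room2"),
    ("dr-closed", "Door12"), ("unlocked", "Door12"),
    ("next-to", "ROBOT", "Door12"),
}
state_c = {
    ("is-door", "Door23"), ("next-to", "ROBOT", "Door23"),
    ("unlocked", "Door23"), ("dr-closed", "Door23"),
}
state_d = {
    ("is-door", "Door34"), ("next-to", "ROBOT", "Door34"),
    ("locked", "Door34"), ("dr-closed", "Door34"),
}

learned = learn_preconditions(
    positives=[(state_a, "Door12"), (state_c, "Door23")],
    negatives=[(state_d, "Door34")],
)
```

With these three examples the learned description contains (unlocked <door>), which the incomplete operator OPEN’ lacks, and it excludes the state in which the door is locked.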


The effects of an operator also represent a concept. This concept corresponds to the
transformation that the operator causes in the state in which it is applied. For example,
when OPEN’ is applied in Sa, the following transformations occur:




    (add (dr-open Door12))
    (del (dr-closed Door12))

When OPEN’ is applied in Sc, the transformation is:

    (add (dr-open Door23))
    (del (dr-closed Door23))

A generalization of these two examples of the transformation is:

    (add (dr-open <door>))
    (del (dr-closed <door>))

which corresponds, in fact, to the effects of OPEN’. If some effect is missing, the problem
will not be noticed locally (execution will be successful), but may be noticed later when
the observed world state diverges from the predicted one. Notice that we always encounter
positive examples of the transformation, since the known effects always occur when the
conditions are true. So in fact the problem of acquiring the effects of an operator is also
a concept learning problem:

Given:
    a set of positive examples
        (i.e., a set of states in which the action was successfully executed
        and the resulting state)

Find:
    a description that covers all the positive examples
        (i.e., a minimal generalization of the transition between the states)
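The transformation view of effects can be sketched in Python as follows. The observed additions and deletions are computed as set differences between the states before and after execution, and the examples are generalized by replacing the door constant with a variable. Taking the intersection across examples is an assumption made for this illustration, and the names `transformation`, `variablize`, and `learn_effects` are hypothetical.

```python
def transformation(before, after):
    """Observed (additions, deletions) between two states."""
    return after - before, before - after


def variablize(facts, constant, var="<door>"):
    """Replace one constant by a variable in every fact."""
    return frozenset(
        tuple(var if term == constant else term for term in fact)
        for fact in facts
    )


def learn_effects(examples):
    """Generalize (before, after, constant) triples into add and delete lists."""
    adds, dels = [], []
    for before, after, const in examples:
        added, deleted = transformation(before, after)
        adds.append(variablize(added, const))
        dels.append(variablize(deleted, const))
    return frozenset.intersection(*adds), frozenset.intersection(*dels)


def door_state(door, open_):
    """Small helper building a state for one unlocked door."""
    status = "dr-open" if open_ else "dr-closed"
    return frozenset({
        ("is-door", door), ("unlocked", door),
        ("next-to", "ROBOT", door), (status, door),
    })


learned_add, learned_del = learn_effects([
    (door_state("Door12", False), door_state("Door12", True), "Door12"),
    (door_state("Door23", False), door_state("Door23", True), "Door23"),
])
```

Only positive examples are used here, in line with the formulation above: every successful execution yields a before and after pair, and there is no failed execution from which to derive a negative example of the transformation.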


There are some references in the literature that consider the left-hand side of rules as
concepts to be learned [Mitchell, 1978; Mitchell et al., 1983; Langley, 1987; Langley et al.,
in press]. However, none of this previous work has pointed out the fact that the effects
of operators represent a concept, and consequently none has viewed their acquisition as a
concept learning problem.

Since  we  provide  the  planner  with  an  initial  domain,  there  is  an  initial  description  for 
the  concepts  of  the  precondition  expression  and  effects.  This  initial  description  may  be 
faulty  in  several  ways  that  are  described  next. 


3.3  Imperfections  in  Domain  Knowledge 

As we discussed in the previous section, the domain model that the planner is initially
given is not necessarily perfect. Several types of imperfections can appear simultaneously
in a domain model. There have been several attempts to classify imperfections [Mitchell
et al., 1986; Rajamoney and DeJong, 1987; Huffman et al., 1992]. This section presents
a more exhaustive classification tailored to planning systems. For each imperfection, we
discuss the types of planning failures that it causes. The section concludes with a more
detailed description of the imperfections addressed by this thesis.


3.3.1  Incomplete  Models 

Incomplete  models  are  those  in  which  some  aspect  is  missing.  Known  operators  may  be 
missing  preconditions  and/or  effects.  Entire  operators  may  be  absent  from  the  model. 

Let us examine first the case of incomplete preconditions. Consider the operator
OPEN’ from the previous section. Again, OPEN’ is incomplete; it is missing the condition
(unlocked <door>). As we saw in the previous section, when the planner executes
OPEN’, the action has no effects when the door happens to be locked. If that is the
case, the planner makes the wrong prediction (that the door will be open). So if the
preconditions of an operator are incomplete, the planner’s predictions will fail because
the effects of the operator will not be obtained.

Now let us look at a case when the effects of an operator are incomplete. Consider,
for example, the following operator:

    (PUTDOWN’
      (params (<ob>))
      (preconds
        (holding <ob>))
      (effects
        ((add (arm-empty))
         ;the effect (del (holding <ob>)) is missing
         (add (next-to robot <ob>)))))

Notice  that  the  operator  is  incomplete;  it  is  missing  the  effect  that  should  delete 
(holding  <ob>).  When  a  planner  executes  PUTDOWN’,  it  will  obtain  the  desired  effects. 
However,  it  will  continue  to  believe  that  the  robot  is  holding  the  object.  So  in  the  case 
of  incomplete  effects,  the  planner's  predictions  will  fail  when  the  wrong  fact  is  used  in 
the  future. 

Incomplete effects may also force the planner to do unnecessary work. Consider the
following operator:

    (PUTDOWN”
      (params (<ob>))
      (preconds
        (holding <ob>))
      (effects
        ((add (arm-empty))
         (del (holding <ob>))
         ;the effect (add (next-to robot <ob>)) is missing
        )))


Now suppose that the planner is given the goal (and (arm-empty) (next-to robot
BoxA)) when the robot is holding BoxA. The planner builds a two-step plan that uses
PUTDOWN” first to achieve (arm-empty) and then GOTO-OBJ to achieve (next-to
robot BoxA). Notice that this last step is unnecessary, but the planner believes it is
needed because it does not know that PUTDOWN” also achieves (next-to robot
BoxA). Thus, unknown effects may cause the planner to build unnecessary subplans.
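The two consequences of incomplete effects can be made concrete by comparing the internal state predicted by the incomplete operator with the external state produced by the real action. The Python sketch below assumes a toy representation of facts as tuples; `apply_effects` and the variable names are hypothetical.

```python
def apply_effects(state, adds=(), dels=()):
    """Return the state after removing `dels` and adding `adds`."""
    return (set(state) - set(dels)) | set(adds)


start = {("holding", "BoxA")}

# What really happens when the robot puts BoxA down (complete effects).
external = apply_effects(
    start,
    adds=[("arm-empty",), ("next-to", "ROBOT", "BoxA")],
    dels=[("holding", "BoxA")],
)

# PUTDOWN': the delete of (holding <ob>) is unknown to the planner.
internal_1 = apply_effects(
    start, adds=[("arm-empty",), ("next-to", "ROBOT", "BoxA")]
)
stale_beliefs = internal_1 - external      # believed, but no longer true

# PUTDOWN'': the add of (next-to robot <ob>) is unknown to the planner.
internal_2 = apply_effects(
    start, adds=[("arm-empty",)], dels=[("holding", "BoxA")]
)
goal = {("arm-empty",), ("next-to", "ROBOT", "BoxA")}
pending_in_model = goal - internal_2       # subgoals the planner still plans for
pending_in_world = goal - external         # subgoals that are really unmet
```

In the first case the planner keeps a stale belief that may mislead it later; in the second it still sees an open subgoal that the world has already satisfied, which is what leads to the unnecessary GOTO-OBJ step.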

A domain model is also incomplete when entire operators are missing. For example,
suppose that no operator is available for opening doors. In this case, the planner has
strong limitations as to the problems that it can solve.

Another case of incompleteness occurs when a state is missing facts about the world.
For example, consider a state containing a description of a door Door45 that connects
Room4 and Room5. The state does not contain information about the door being either
locked or unlocked. In this case, some operator’s preconditions will not be matched in
the state. So when facts are missing from the state, the applicability of operators is
restricted to the known facts.


3.3.2  Incorrect  Models 


Incorrect  models  have  some  aspect  that  does  not  correspond  to  reality,  or  contain  overly 
specific  knowledge.  This  happens  when  an  operator  has  erroneous  conditions  or  effects, 
or  some  conditions  or  effects  that  are  overly  specific. 

Let us first consider the case of erroneous conditions. Consider the following operator:


    (OPEN”
      (params (<door>))
      (preconds
        (and (is-door <door>)
             (next-to robot <door>)
             (unlocked <door>)
             (dr-closed <door>)
             (holding <door>)))        ; this condition is erroneous
      (effects
        ((del (dr-closed <door>))
         (add (dr-open <door>)))))

Notice that this operator has an incorrect condition: it requires that the robot be
holding the door. When a planner tries to use OPEN” in a plan it will always fail, since
there is no way for the robot to be holding the door. So when a condition is erroneous,
it may not be possible to use the operator to construct a plan.

Let  us  look  at  another  case  of  erroneous  conditions.  Consider  the  following  operator: 

    (OPEN’’’
      (params (<door>))
      (preconds
        (and (is-door <door>)
             (next-to robot <door>)
             (unlocked <door>)
             (dr-closed <door>)
             (next-to <box> <door>)))  ; this condition is erroneous
      (effects
        ((del (dr-closed <door>))
         (add (dr-open <door>)))))

In  this  case,  the  erroneous  condition  can  be  achieved  by  the  planner,  so  this  operator 
can  be  used  to  construct  a  plan.  However,  the  part  of  the  plan  that  places  the  box  next 
to  the  door  is,  as  we  know,  totally  unnecessary  for  opening  the  door.  So  an  erroneous 
condition  may  force  the  planner  to  create  plans  that  are  longer  than  needed  in  order  to 
achieve  unnecessary  subgoals. 

Now  let  us  look  at  the  case  of  overly  specific  conditions.  Consider  for  example  the 
following  operator: 

    (OPEN’’’’
      (params (<door>))
      (preconds
        (and (is-door <door>)
             (next-to robot <door>)
             (unlocked <door>)
             (dr-closed <door>)
             (color-of <door> RED)))   ; this condition is overly specific
      (effects
        ((del (dr-closed <door>))
         (add (dr-open <door>)))))

The predicate (color-of <door> RED) is unnecessary, making the precondition expression
too specific, since the operator can only be used when the door to be opened
is red. Non-red doors can never be opened. So when a condition of an operator is
overly specific, the planner’s capabilities are restricted by the more limited range of
applicability of the operator.

The facts that the state contains can be incorrect as well. For example, the planner’s
state may contain the fact (locked Door12) when the door is, in fact, unlocked. In this
case, some operator’s preconditions will be matched in the state when the action is not
applicable, and vice versa.


3.3.3  Inadequate  Models 

Inadequate models are those whose language lacks the appropriate primitives to express
the aspects of the external world that are needed for problem solving. Consider OPEN’. If
the predicate (unlocked <door>) were not only missing from the preconditions but also
absent from the language of this domain, the planner would not be able to reason about
locks in the doors, thus failing to open any locked door.


3.3.4  Intractable  Models 

Intractable models are those in which it is prohibitively expensive (time-consuming) to
derive a plan. In this case, control knowledge is needed to direct the search. As we
mentioned in Chapter 2, much research has been done to address intractable domain
models by learning control knowledge to expand the boundary of problems solvable within
given time restrictions.


3.3.5  Types  of  Incompleteness 

This thesis is concerned with refining incomplete theories only. Learning when the given
domain is incorrect, inadequate, or intractable will be discussed briefly in the future
work section. Notice that inadequate and intractable models can be considered incomplete,
since they are in fact missing some aspect of the external world. They are listed
separately, however, because they are best addressed with different mechanisms.

A  domain  theory  may  be  incomplete  in  several  ways: 

•  Operators may be partially specified: the planner may know only some of their
preconditions and some of their consequences.

•  Entire operators may be missing: the planner may not know all its capabilities.

•  Object types or instances may not appear in the description of the state: knowledge
about the objects that must be manipulated may be missing. The operators may
not contain enough information about which object types they may be applied to
in order to achieve the desired effects.

•  Attributes of objects in the world may be unknown: attributes of objects can
be combined to form new attributes. For example, mass and volume define the
attribute weight via a formula. The range of values that already known attributes
can take may be further specified.

•  Factual properties may be missing from the state: the concrete value of an attribute
of some object is unknown (e.g., size, color, weight, category).

Section  3.3.1  contains  examples  of  the  first  and  last  cases.  As  we  saw  in  that  section, 
each  case  causes  a  different  type  of  planning  failure.  This  is  why  each  case  needs  to 
be  addressed  differently.  Chapter  5  describes  methods  for  detecting  different  types  of 
failures  and  how  to  adjust  the  domain  knowledge  in  each  one  of  the  above  cases.  There 
are  several  ways  to  detect  and  refine  incomplete  knowledge.  One  is  to  rely  on  a  human 
to  build  the  knowledge  iteratively  by  testing  it  on  sample  problems  and  correcting  errors 
by hand. Another is to have the system learn autonomously by interacting with the
environment, as the next section describes.


3.4  Learning  from  the  Environment 

A  planner  is  a  problem-solving  engine  typically  used  in  applications  that  involve  physical 
systems.  Some  examples  are: 

•  Path  planning  [Brady,  1982],  which  involves  finding  a  route  for  a  robot  controller. 

•  Process  planning  [Chang  and  Wysk,  1985],  where  the  planner  is  given  a  specification 
of  a  product  and  finds  a  sequence  of  operations  to  manufacture  it. 

•  Using  plans  for  understanding  natural  language  [Wilensky,  1981],  where  information 
about  an  agent’s  goals  and  plans  proves  to  be  very  useful  for  interpreting  stories. 

The  resulting  plans  represent  sequences  of  actions  that,  once  executed,  transform  the 
current  state  of  the  physical  system  (also  called  environment)  into  a  desired  state.  Thus, 
the  domain  knowledge  of  a  planner  models  the  external  system  in  order  to  reason  about 
its  behavior  and  act  accordingly.  The  operators  constitute  the  planner’s  knowledge  of 
how  to  affect  its  environment.  The  domain  model  is  a  good  representation  of  the  external 
processes  if  it  allows  the  planner  to  extract  all  conclusions  that  are  relevant  or  necessary 
for  its  task.  In  other  words,  a  good  model  encompasses  what  is  expected  from  the  external 
system. 

Any disagreement between these expectations and the results of the external processes
indicates an imperfection in the model (of some of the types indicated in Section
3.3). Any autonomous system must be able to observe its environment and to adjust
its internal model when expectation failures occur. Many times it is not clear which
fault in the model caused the wrong prediction. It may be necessary to perform a series
of directed manipulations of the external system in order to collect more observations
related to the failure. These directed manipulations are what we call experiments, and
their purpose is to gather enough data for the system to update its imperfect model.
In summary, observing and manipulating the environment is necessary for this type of
learning to occur. These interactions with the environment raise many issues currently
under research. This section describes the particular limitations of our system that are
directly related to its interaction with the environment.

Figure 3.1: An intelligent agent interacts with the world. Operators correspond to actions.
The external state is represented by an internal state.

3.4.1  Interaction  with  an  External  Environment 

Figure 3.1 shows an intelligent system that has the ability to interact with some external
system, also referred to as the external world or environment.

Operators are internal models of external actions. Operators are applied by updating
the internal state according to their effects. The action that corresponds to the operator
is always executed in the external world.

Definition.  The  execution  of  an  action  succeeds  if  all  the  known  effects 
of  the  corresponding  operator  happen  in  the  external  world  when  all  the 
preconditions  are  satisfied. 

Definition. The execution of an action fails if some effects of the corresponding
operator did not happen as expected when all the preconditions are
satisfied.

When a goal is given to the planner, it designs a plan to achieve that goal. Then the
plan is executed step by step. Each step is an operator whose corresponding action must
be executed in the environment. Whenever the system decides to execute an action in the
external world, it is always the case that the internal state indicates that the corresponding
operator is applicable. At this point the system first checks whether the preconditions of
the operator are indeed satisfied in the external world. If the model is correct then the
check will be positive, and the action is executed in the external world. Then the system
checks whether the action has in fact been executed correctly by checking the effects of the
operator in the external world. If the model is correct then the execution will be successful;
whatever the goal of the system is, it is achieved after the execution of the sequence of
actions proposed by the planner.

Notice that in this scheme, the system is not necessarily observing all possibly
observable facts about the external state. Its attention is focused only on the facts that are
relevant to the application of the action, which are precisely the predicates included in
the preconditions and the effects of the corresponding operator.

The system always has some expectations about the world, and they are represented
by the internal state. The observations that the system can make correspond to the
real state of the external world. In order to know if the model is accurate, the system
compares its expectations with its observations. When there is a difference between
the system’s expectations and its observations, then some fault in the model has been
detected and there is opportunity for learning how to correct it.
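The execution scheme described above (check the preconditions externally, execute, then check the known effects) can be sketched in Python. This is an illustration under stated assumptions and not the thesis implementation: the `world` object with `observe` and `execute` methods is a hypothetical interface, `StubWorld` is a toy environment invented for the example, and leaving the internal state unchanged after a failure is an arbitrary choice made here.

```python
def ground(fact, bindings):
    """Replace variables such as "<door>" by their bound constants."""
    return tuple(bindings.get(term, term) for term in fact)


def execute_and_monitor(op, bindings, internal, world):
    """Execute one plan step and compare expectations with observations."""
    preconds = [ground(p, bindings) for p in op["preconds"]]
    # The internal state says the operator is applicable; confirm it externally.
    unmet = [p for p in preconds if not world.observe(p)]
    if unmet:
        return set(internal), ("precondition-mismatch", unmet)

    world.execute(op["name"], bindings)

    adds = [ground(f, bindings) for f in op["adds"]]
    dels = [ground(f, bindings) for f in op["dels"]]
    # Only the facts named in the known effects are observed.
    missing = [f for f in adds if not world.observe(f)]
    lingering = [f for f in dels if world.observe(f)]
    if missing or lingering:
        return set(internal), ("execution-failure", missing + lingering)
    return (set(internal) - set(dels)) | set(adds), ("success", [])


class StubWorld:
    """Toy environment in which a door opens only if it is unlocked."""

    def __init__(self, facts):
        self.facts = set(facts)

    def observe(self, fact):
        return fact in self.facts

    def execute(self, name, bindings):
        door = bindings["<door>"]
        if name == "OPEN" and ("unlocked", door) in self.facts:
            self.facts.discard(("dr-closed", door))
            self.facts.add(("dr-open", door))


# The incomplete operator OPEN' (no unlocked condition).
OPEN_PRIME = {
    "name": "OPEN",
    "preconds": [("is-door", "<door>"), ("next-to", "ROBOT", "<door>"),
                 ("dr-closed", "<door>")],
    "adds": [("dr-open", "<door>")],
    "dels": [("dr-closed", "<door>")],
}

state_c = {("is-door", "Door23"), ("next-to", "ROBOT", "Door23"),
           ("unlocked", "Door23"), ("dr-closed", "Door23")}
state_d = {("is-door", "Door34"), ("next-to", "ROBOT", "Door34"),
           ("locked", "Door34"), ("dr-closed", "Door34")}

new_c, status_c = execute_and_monitor(
    OPEN_PRIME, {"<door>": "Door23"}, state_c, StubWorld(state_c))
new_d, status_d = execute_and_monitor(
    OPEN_PRIME, {"<door>": "Door34"}, state_d, StubWorld(state_d))
```

For the unlocked door the step succeeds and the internal state is updated; for the locked door the expected effects are not observed, which signals an expectation failure and an opportunity to learn.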

One possible cause for a difference between expectations and observations is the presence
of other agents that can interact with the same environment. If there are other
agents, then the cause of the difference might not be a fault in the model: the internal
state of the agent is not updated with the effects of actions that other agents execute
without its knowledge. If no cause for the difference is found, the system should consider
that some action was executed without its knowledge, and update its internal state
accordingly. Another possible source of differences is a nondeterministic environment, in
which the outcome of an action under the same circumstances can vary. Finally, noisy
sensors can signal unexpected observations that do not correspond to the real external
state. This work does not consider any of these possibilities.

The actions that the agent performs are considered to be independent. This means
that the results of an action can be observed immediately after it is executed and
do not depend on the actions executed previously. This last assumption simplifies
the problem enormously. Fortunately, it holds in most planning domains.
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3.4.2  Simulator 

For our implementation, we built a simulator of the external environment. The simulator
uses a complete and correct set of operators to model the available actions*, as well
as a state to represent the external state. In addition to the domain operators, the
simulator is also given operators to simulate failure conditions. So if the preconditions of
an operator O are (p1 ∧ p2 ∧ p3), a failure operator can be constructed with the condition
(~p1 ∨ ~p2 ∨ ~p3) and the effects to be obtained when one or more conditions are
not true. For example, a failure operator would represent the action of opening the door
when the door is locked. When an observation is requested from the simulator, it is
obtained from the state. When an operator must be executed, the simulator applies to
its state the simulator's operators whose conditions match.
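A minimal propositional sketch of such a simulator follows. The class and field names (`SimOperator`, `Simulator`, `failure_adds`, `failure_dels`) are hypothetical, variables and quantifiers are left out, and the failure operator is folded into the same record as the domain operator; it is meant only to illustrate the idea, not to reproduce the implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SimOperator:
    name: str
    preconds: frozenset                     # p1, p2, p3: all must hold
    adds: frozenset = frozenset()
    dels: frozenset = frozenset()
    failure_adds: frozenset = frozenset()   # effects of the failure operator
    failure_dels: frozenset = frozenset()


class Simulator:
    def __init__(self, state, operators):
        self.state = set(state)             # the external state
        self.operators = {op.name: op for op in operators}

    def observe(self, literal):
        """Observations are read directly from the simulated state."""
        return literal in self.state

    def execute(self, name):
        """Apply the operator if its preconditions match, else its failure operator."""
        op = self.operators[name]
        if op.preconds <= self.state:
            self.state = (self.state - op.dels) | op.adds
            return True
        # Some precondition is false: (~p1 v ~p2 v ~p3) matches.
        self.state = (self.state - op.failure_dels) | op.failure_adds
        return False


# Hypothetical door example: opening a locked door has no effect.
open_door = SimOperator(
    name="open-door",
    preconds=frozenset({("unlocked", "door1")}),
    adds=frozenset({("open", "door1")}),
)
sim = Simulator({("locked", "door1")}, [open_door])
first_try = sim.execute("open-door")
sim.state.add(("unlocked", "door1"))
second_try = sim.execute("open-door")
```

Leaving `failure_adds` and `failure_dels` empty gives failure operators without effects; a domain with side effects on failure could fill them in.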

In  our  simulations,  the  failure  operators  do  not  have  any  effects.  In  some  domains, 
executing  these  operators  may  have  spurious  effects.  For  example,  consider  a  drilling 
operator  in  the  process  planning  domain.  Suppose  that  the  presence  of  cutting  fluid  is 
a  necessary  condition  for  drilling,  since  it  absorbs  the  heat  produced  by  the  operation. 
If  that  condition  is  missing  from  the  drilling  operator,  the  failure  operator  used  by  the 
simulator  should  have  the  effects  that  this  operation  has  in  the  real  world,  i.e.,  that  the 
drill  bit  is  damaged  by  the  excess  of  heat  as  well  as  the  part. 

Our simulator did not represent noise in observations, nor spurious effects that the
execution of an erroneous operator may have. This is not a very sophisticated scheme to
model the complexity of the real world, but it provides the types of external interactions
necessary for experimentation.


3.5  Experimentation 

As  we  saw  in  the  previous  section,  the  interaction  with  the  external  world  is  a  powerful 
tool  for  acquiring  new  domain  knowledge.  The  directed  manipulation  of  the  environment 
through  experiments  makes  the  learner  proactive  and  reactive  in  the  learning  process. 
This section describes what experiments mean in this thesis, why they greatly
facilitate the learning task, and what is involved in the formulation of experiments.

3.5.1  Task-driven  Experimentation 

In recent years, the topic of experimentation has received significant attention in Artificial
Intelligence. The range of concepts embraced by the word “experimentation” is so broad
that it is not possible to give an operational definition that includes them all. Scientists,
philosophers, and psychologists have used this term in such diverse contexts that any
attempt to reconcile the various perspectives is doomed to failure. Even in the field
of Artificial Intelligence there are different ways of understanding the term. Figure 3.2
presents a classification under which the different interpretations of experimentation may
be grouped.

* Footnote to Section 3.4.2: Notice that neither EXPO nor the planner has access to this
complete domain, which is used solely for the simulation.

The  broadest  definition  of  experimentation  includes  thought  experiments  (also  called 
Gedanken  experiments).  These  include  any  mental  supposition  followed  by  its  mental 
test.  For  example,  we  all  do  this  kind  of  experimentation  when  trying  to  solve  some  prob¬ 
lem  that  requires  making  suppositions  and  figuring  out  what  would  happen  if  they  were 
made  true.  When  the  test  is  actually  performed  in  some  way,  then  the  experimentation 
is  active  and  usually  involves  an  action  in  the  external  world. 

Purposeful  experimentation  can  be  intentional  or  curiosity-driven.  Many  of  the  actions 
taken  by  children  at  play  are  of  the  latter  kind,  where  actions  are  applied  just  to  see  what 
happens,  just  to  determine  their  effects.  Pure  curiosity  can  lead  to  the  exploration  of  the 
consequences  of  the  set  of  actions  available.  In  this  case,  surprises  can  trigger  experiments 
that  have  some  intention  by  themselves.  Another  purpose  of  this  kind  of  experimentation 
can be to analyze the consequences of certain actions that have proved to be interesting
for the system. This means that it will be able to gather knowledge from the experiment
that the system may otherwise be missing. Passive observations of the actions performed
by another entity could be included in this group.

Task-driven experiments imply deliberately provoking some change in external conditions
when an experiment is performed as a means to gather knowledge that is necessary
to  achieve  a  previously  set  goal.  The  consequences  of  such  deliberate  actions  are  ob¬ 
served  and  the  system  corrects  its  knowledge  to  adjust  it  so  as  to  match  more  closely  its 
environment.  The  experiments  are  directed  to  find  the  knowledge  that  the  system  needs 
to  solve  the  task.  Task-driven  experimentation  describes  best  the  work  in  this  thesis, 
and  is  highlighted  in  Figure  3.2. 

Confirmation  experiments  are  performed  to  test  the  degree  of  validity  of  a  certain 
hypothesis.  In  this  case,  there  is  some  preconceived  knowledge  of  what  the  exact  conse¬ 
quences  might  be.  If  the  system  can  have  a  range  of  values  that  describe  the  credibility 
of  its  knowledge,  experimentation  can  be  useful  to  give  the  system  a  more  accurate  idea 
of  the  validity  of  each  belief.  Other  systems  can  accept  or  reject  a  hypothesis  on  the 
basis  of  a  single  experiment. 

A particular case of confirmation experiments is the scientific method (sometimes also
called experimental method) in which experiments are designed to test some theory. As
Kuhn [Kuhn, 1977] described them, they can either refute or confirm a theory, but never
assure its complete validity. We do not relate any of our current research to this definition
of experimentation. On the contrary, we will explore ways in which experimentation will
allow our system to acquire new knowledge, but never with a preconceived theory to be
confirmed or refuted by the outcome of the experiments. The word experimentation will
be dissociated from the usual interpretation in the context of the scientific method. This
does not mean a total separation, however. Many of the early chemistry experiments,
for example, lacked theoretical basis.

Experimentation
(including thought experiments, physical experiments, ...)

    Physical experimentation: actions in external world
    (including serendipity, idle curiosity, ...)

        Purposeful experimentation (goal-driven):
        acquisition or confirmation of new knowledge

Figure 3.2: What is Experimentation? Our operational definition is task-driven
experimentation, where deliberate changes in external conditions are performed as a means to
gather knowledge that is necessary to achieve a previously set goal.

Our  work  does  not  represent  an  effort  to  give  a  solution  to  the  global  problem  of 
automating  the  process  of  making  experiments  as  a  whole.  Rather,  we  focus  our  attention 
on  a  few  points  of  the  fairly  large  space  of  experimentation.  Here,  we  always  refer  to 
experimentation  in  an  active  planning  context:  there  is  a  goal,  a  state,  and  a  (partially) 
formulated  plan.  Experiments  are  task-driven,  always  directed  at  overcoming  a  current 
impasse  in  the  planning  processes  due  to  a  lack  of  domain  knowledge.  This  means  that 
the  description  of  the  world  that  is  learned  is  one  that  is  useful  for  solving  the  problems 
that  the  intelligent  system  must  solve.  We  never  learn  in  this  framework  any  properties  of 
the  world  irrelevant  to  the  problem-solving  task,  i.e.,  we  are  not  modeling  idle  curiosity. 
This  kind  of  task-driven  experimentation  gives  the  system  a  context  in  which  to  learn 
and  more  focused  information  for  the  experiments. 


3.5.2  Efficient  Experimentation 

When expectations and observations differ, the system engages in an expensive process
of finding what knowledge it is missing that would account for the difference.
Experimentation can be described as having the following steps:

1.  Hypothesis  formation:  Find  possible  hypotheses  that  explain  the  phenomenon. 
It  is  not  necessary  to  enumerate  all  possibilities,  since  the  system  should  try  first 
the  most  plausible  ones.  Identifying  the  most  plausible  hypotheses  facilitates  the 
process  enormously,  but  it  is  also  a  complicated  matter. 

2.  Requirements  for  an  Experiment:  Decide  what  is  required  to  test  a  given 
hypothesis.  Testing  a  hypothesis  might  require  several  experiments. 

3.  Experiment:  Experiments  are  done  in  three  phases: 

(a)  Design:  In  order  to  obtain  the  data  that  the  system  needs,  an  experiment 
must  be  designed  with  the  appropriate  functionality.  Experiments  are  de¬ 
signed  following  the  requirements  specified  in  Step  2,  and  instantiating  any 
variables  that  are  not  constrained  by  the  requirements.  If  many  experiments 
are  possible,  one  must  be  chosen.  The  design  phase  includes  planning  to 
achieve the state where the experiment is to be performed.

(b) Execution: Once designed, the experiment can be carried out on the external
environment. 

(c)  Observation:  After  the  experiment  has  been  performed,  the  system  obtains 
feedback  from  the  external  world. 

4. Analysis: When the results of experiments are analyzed, the system might have
found the information that it sought. If not, it might design and perform more
experiments, or go back to the hypothesis formation stage to revise its hypotheses.

5.  Confirmation:  Confirmation  experiments  may  be  designed  and  carried  out  to 
corroborate  the  hypotheses  emerging  from  the  results  of  the  experiments  just  per¬ 
formed. 

6.  Acquisition:  Based  on  the  observations,  the  system  might  or  might  not  change  its 
current  knowledge.  Possible  changes  include  correcting  what  is  inaccurate,  adding 
missing  information,  and  confirming  existing  knowledge. 

7.  Recovery:  The  state  of  the  world  before  the  experiment  was  performed  might 
have  to  be  restored.  Performing  an  experiment  might  have  affected  the  initial  set 
of  goals  either  violating  goals  (negative  interactions)  or  achieving  goals  (positive 
interactions). 

The  cycle  of  steps  1  through  4  is  repeated  until  the  experiments  yield  the  information 
sought  or  the  system  decides  to  give  up  and  work  on  another  task. 

The requirements E_requirements for experiments that result from Step 2 are specified
as follows:

• E_operator: the operator about which the system tries to collect more information.

• E_current-state: the state the system is currently in.

• E_exper-state: a state in which the experiment is to be performed. It is any state that
matches all the preconditions of the operator, plus an additional set of conditions
necessary for the experiment (usually related to the hypothesis being tested).

• E_observe: observations to be collected before and after the action that corresponds
to E_operator is executed.
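The four components listed above can be pictured as a small record. The sketch below is an assumption-laden illustration: the class `ExperimentRequirements`, the helper names, and the representation of states as sets of ground literals are all invented here for clarity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentRequirements:
    operator: str             # operator under investigation
    current_state: frozenset  # state the system is currently in
    exper_state: frozenset    # conditions the experiment state must satisfy
    observe: frozenset        # what to observe before and after execution


def make_requirements(op_name, op_preconds, current_state, hypothesis, observe):
    """Experiment state = operator preconditions plus hypothesis conditions."""
    return ExperimentRequirements(
        operator=op_name,
        current_state=frozenset(current_state),
        exper_state=frozenset(op_preconds) | frozenset(hypothesis),
        observe=frozenset(observe),
    )


def state_is_suitable(req, state):
    """True if the experiment can be run in `state` without further planning."""
    return req.exper_state <= frozenset(state)


# Hypothetical grinding example: test whether cutting fluid is required.
req = make_requirements(
    op_name="GRIND",
    op_preconds={("holding-tool", "grinder1", "wheel1")},
    current_state={("holding-tool", "grinder1", "wheel1")},
    hypothesis={("has-fluid", "grinder1")},
    observe={("surface-finish", "part1", "side1", "SMOOTH")},
)
```

When `state_is_suitable(req, req.current_state)` is false, a plan is needed to reach a state that satisfies `req.exper_state` before the experiment can be executed.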

The methods for learning by experimentation in Chapter 5 detect expectation failures,
find hypotheses to correct them, and produce E_requirements. The rest of the experimentation
stages are addressed in Chapter 4.

Many hypotheses can be plausible for any given phenomenon. For each hypothesis, we
can envision many possible experiments. Each experiment requires, among other things,
setting the environment in the appropriate state to perform it. This involves the use of
the planner to achieve that state. Many plans may be possible, each involving different
resources. Experiment design and execution can be costly. Thus, the use of experimentation
requires a framework where the most promising hypotheses and experiments are
considered first.


3.6  PRODIGY 

The  methodology  described  in  this  thesis  is  implemented  in  an  experimentation  system 
called EXPO. EXPO uses PRODIGY [Minton et al., 1989a; Minton et al., 1989b; Carbonell
et al., 1991] as the underlying planning system. PRODIGY is a general-purpose problem
solver that serves as a testbed for planning and machine learning research. The central
problem solver was purposefully designed with a “glass-box” approach; all the steps
taken,  all  the  decisions  made,  and  all  the  information  consulted  by  the  engine  are  available 
in  a  problem’s  trace.  This  is  a  very  useful  feature  for  any  learning  system,  since  there 
is  an  information  context  in  which  learning  can  take  place.  In  addition,  PRODIGY  is  a 
well-developed  and  thoroughly  tested  tool. 

This  section  first  presents  the  particular  description  language  that  PRODIGY  uses  to 
represent  domain  knowledge.  Then  it  describes  briefly  other  learning  methods  imple¬ 
mented  on  PRODIGY  to  discuss  their  relationship  with  EXPO. 


3.6.1 PRODIGY's Domain Knowledge

In  PRODIGY,  the  domain  knowledge  is  given  by  a  set  of  operators  and  inference  rules. 
The  operators  are  models  of  the  available  actions,  specifying  under  which  conditions 
(preconditions)  an  action  has  which  effects  (postconditions).  Inference  rules  are  used  to 
deduce  additional  information  from  the  state.  A  problem  is  given  by  an  internal  state, 
representing  the  current  state  of  the  world,  and  a  goal  statement.  PRODIGY  searches  for 
a  solution  using  backward  chaining  means-ends  analysis. 

The  preconditions  of  an  operator  are  represented  by  an  expression  in  a  special  type  of 
first-order logic called PDL (for PRODIGY's Description Language). PDL allows negation,
conjunction,  disjunction,  and  universal  and  existential  quantification.  The  effects  can  be 
primary  or  conditional  (when  their  application  depends  on  the  state  in  which  the  operator 
is applied). Figure 3.3 presents a BNF description for PDL.

LOW-LEVEL SYNTAX:
constant  := ATOM
variable  := <ATOM>
predicate := ATOM
term      := variable | constant | exp
var-list  := (variable variable ...)

SYNTAX FOR FORMULAS:
exp             := atomic-exp | negated-exp | existential-exp | universal-exp |
                   conjunctive-exp | disjunctive-exp
atomic-exp      := (predicate term term ...)
negated-exp     := (~ existential-exp) | (~ atomic-exp)
disjunctive-exp := (OR exp exp exp ...)
conjunctive-exp := (AND exp exp exp ...)
existential-exp := (EXISTS var-list generator exp)
universal-exp   := (FORALL var-list SUCH-THAT generator exp)
generator       := atomic-exp

SYNTAX FOR OPERATORS:
operator-name      := ATOM
simple-effect      := (ADD atomic-exp) | (DEL atomic-exp)
conditional-effect := (IF exp [simple-effect]*)
effect             := simple-effect | conditional-effect
operator           := (operator-name (PRECONDS exp) (EFFECTS (effect effect ...)))

Figure 3.3: PRODIGY's Description Language and Operator's Syntax

Inference  rules  are  used  in  PRODIGY  to  deduce  additional  facts  about  the  current 
state.  While  the  application  of  an  operator  produces  a  new  state,  the  application  of  an 
inference  rule  augments  the  facts  that  are  known  about  the  current  state.  The  predicates 
added  by  an  inference  rule  are  called  open  world,  and  are  only  computed  on  demand  by 
backward-chaining  on  the  rule.  Inference  rules,  unlike  operators,  do  not  correspond  to 
any  external  actions. 

PDL  allows  functions  to  be  part  of  the  preconditions  of  an  operator.  Consider,  for 
example,  the  following  operator: 

(PICKUP-OBJ
  (preconditions
    (and (arm-empty)
         (next-to ROBOT <obj>)
         (is-object <obj>)
         (weight-of <obj> <weight>)
         (less-than <weight> 10)))
  (effects (
    (del (arm-empty))
    (del (next-to <obj> <*other-obj-1>))
    (del (next-to <*other-obj-2> <obj>))
    (add (holding <obj>)))))

less-than  is  a  function  whose  two  arguments  range  over  the  real  numbers.  It  is 
written  as  a  Lisp  function,  and  it  returns  true  if  its  first  argument  is  smaller  than  the 
second  one.  The  possibility  of  including  functions  in  the  preconditions  makes  PDL  very 
powerful,  since  any  computable  function  can  be  used  as  a  precondition.  But  this  same 
property  makes  learning  more  difficult,  as  we  describe  in  Section  4.1. 
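To make the role of functions in preconditions concrete, the following sketch matches a conjunction of preconditions against a state and evaluates `less-than` as an ordinary function once its arguments are bound. It is a toy matcher written for illustration, not PRODIGY's matcher: the tuple encoding, the `FUNCTIONS` table, and the `satisfy` helper are assumptions, and quantifiers, negation and disjunction are omitted.

```python
import operator

# Assumed table of functional predicates; less-than mirrors the Lisp function.
FUNCTIONS = {"less-than": operator.lt}


def is_var(term):
    return isinstance(term, str) and term.startswith("<")


def satisfy(preconds, state, bindings=None):
    """Yield every variable binding under which all preconds hold in state."""
    bindings = bindings or {}
    if not preconds:
        yield dict(bindings)
        return
    first, rest = preconds[0], preconds[1:]
    pred, args = first[0], first[1:]
    if pred in FUNCTIONS:
        values = [bindings.get(a, a) for a in args]
        if any(is_var(v) for v in values):
            return  # a function can only be tested on bound arguments
        if FUNCTIONS[pred](*values):
            yield from satisfy(rest, state, bindings)
        return
    for fact in state:
        if fact[0] != pred or len(fact) != len(first):
            continue
        new, ok = dict(bindings), True
        for arg, value in zip(args, fact[1:]):
            arg = new.get(arg, arg)
            if is_var(arg):
                new[arg] = value      # bind a fresh variable
            elif arg != value:
                ok = False
                break
        if ok:
            yield from satisfy(rest, state, new)


PICKUP_PRECONDS = [
    ("arm-empty",),
    ("next-to", "ROBOT", "<obj>"),
    ("is-object", "<obj>"),
    ("weight-of", "<obj>", "<weight>"),
    ("less-than", "<weight>", 10),
]
state = {
    ("arm-empty",),
    ("next-to", "ROBOT", "box1"), ("is-object", "box1"), ("weight-of", "box1", 4),
    ("next-to", "ROBOT", "safe"), ("is-object", "safe"), ("weight-of", "safe", 80),
}
matches = list(satisfy(PICKUP_PRECONDS, state))
```

Only the light object satisfies the functional test. Because `less-than` is opaque code rather than a predicate stored in the state, a learner cannot discover such a condition simply by comparing states.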


3.6.2 Learning in PRODIGY

Figure  3.4  depicts  the  different  learning  modules  that  have  been  developed  for  PRODIGY. 

Learning  is  used  to  speed  up  problem  solving  through  the  automatic  acquisition  of 
episodes  useful  for  analogical  reasoning  [Veloso,  1992],  producing  abstraction  hierarchies 
[Knoblock,  1991],  and  learning  control  rules  [Minton,  1988;  Etzioni,  1990;  Perez  and 
Etzioni,  1992].  All  these  methods  are  designed  to  capture  control  knowledge  to  guide  the 
search  process.  The  domain  knowledge  is  never  changed. 

None of these learning methods addresses the issue of how the domain knowledge is
acquired. In PRODIGY, learning at the knowledge level is done both from the user through
an apprentice-type system [Joseph, 1992] and from the environment through autonomous
learning via experimentation (as described in this thesis). The APPRENTICE system
provides a user-friendly interface for defining the operators and the problems in a domain.

EXPO is a module that automatically refines a knowledge base by direct interaction
with the environment. Given some initial domain knowledge (defined through APPRENTICE
or in any other way), EXPO monitors plan execution to detect faults in the
operators. Experimentation is used to correct these faults. Learning produces new and
improved definitions of the operators. Notice that, unlike APPRENTICE, EXPO does
not require interaction with a user, being the only module in PRODIGY that learns new
domain knowledge autonomously.

Figure 3.4: A Schematic Representation of PRODIGY. EXPO is the only system that
acquires new domain knowledge autonomously.


Chapter  4 

The Experimentation Process: Step by Step


This  chapter  describes  how  to  detect  faults  in  a  planner’s  domain  knowledge,  and  how  to 
design  experiments  to  pinpoint  the  faults  and  correct  the  domain.  The  experimentation 
process  is  described  in  detail  for  one  particular  case:  acquiring  new  preconditions  of 
operators.  The  chapter  presents  both  general  descriptions  of  the  techniques  used  and 
their  particular  implementation  in  EXPO. 

The chapter begins by describing a method for detecting operators that are missing
some  preconditions.  Then  it  shows  how  to  construct  hypotheses  as  a  set  of  predicates 
representing  possible  new  preconditions  of  the  operator.  Section  4.3  describes  a  set  of 
heuristics  that  compare  the  hypotheses  and  choose  the  ones  most  likely  to  yield  the 
condition  missing  from  the  operator.  Section  4.4  describes  how  to  design  experiments  to 
test  each  chosen  hypothesis.  Experiment  design  is  cast  as  a  search  for  a  set  of  conditions 
necessary  to  (dis)confirm  the  hypothesis,  and  a  plan  to  bring  them  about.  Many  different 
criteria  considered  for  this  design  space  are  described  in  this  section  as  policies.  A 
combination  of  policies  forms  a  strategy,  which  guides  the  search  to  design  experiments 
that  meet  the  desired  criteria.  This  section  describes  two  very  different  strategies  used 
by EXPO. The chapter continues by describing how the experimentation process is carried
out until the missing precondition is found, and how problem solving is continued after
learning from the experiments. The chapter ends with a discussion on how the techniques
described compare with experimentation techniques of other systems.

4.1 Detecting Missing Preconditions

Suppose that a planner is given the incomplete operator from the process planning domain
shown in Figure 4.1. This operator models the process of grinding a metallic surface. A
grinder holds a part with some holding device, and, using a grinding wheel as a tool, it
changes the size of the part along a selected dimension. This representation may seem
correct, but in fact the system will find additional facts that are required through its
experience. For example, the operator is missing the precondition that the grinder must
have cutting fluid. Grinding is an abrasive operation that generates heat as a result of
the friction between the tool and the part. If no cutting fluid is present to absorb the
heat, then the grinding process will not produce the desired size (the grinder and the
part will overheat instead).

(GRIND-INCOMPLETE
  (preconditions
    (and
      (is-a <machine> GRINDER)
      (is-a <tool> GRINDING-WHEEL)
      (is-a <part> PART)
      (holding-tool <machine> <tool>)
      (side-up-for-machining <dim> <side>)
      (holding <machine> <holding-device> <part> <side>)))
  (effects (
    (add (surface-finish <part> <side> SMOOTH))
    (add (size-of <part> <dim> <value>)))))

Figure 4.1: An incomplete model of grinding

Suppose that the system is trying to grind a part to make its length smaller. Before
grinding the part, the system checks that the preconditions are true in the external
world, as shown in Figure 4.2(a). Since the observations confirm the expectations, the
system goes ahead and applies the action to try to grind the part. After applying it,
the postcondition of GRIND is checked in the external state. The size of the part has
changed to be of size k, but the surface finish is not as expected, as shown in
Figure 4.2(b). This may be because the known effect that specifies the new surface finish
is wrong, or because the operator is missing a necessary precondition. We consider the
latter hypothesis first, that some unknown precondition is not true in the state and thus
the grinding action is not working as the given operator specifies.

How could we find out what the missing precondition is? We can try to find out what
conditions were true in an earlier successful application of the operator that are not true
now. Figure 4.2(c) shows a previous successful situation when the grinder had fluid and
the operation worked. The system now puts fluid in the grinder, and tries again to apply
the operator. Now the action is successfully applied, and the operator is corrected.

[Figure 4.2, panels as far as they can be recovered from the extracted text:
(a) Before grinding, the external and internal states agree: (is-a grinder1 GRINDER),
(is-a wheel1 GRINDING-WHEEL), (holding-tool grinder1 wheel1), (holding grinder1 vise1 part1).
(b) After GRIND(part1, LENGTH), application failure: both states contain
(size-of part1 LENGTH k), but (surface-finish part1 side1 SMOOTH) is expected while
(surface-finish part1 side1 ROUGH) is observed.
(c) Past internal state of a successful application: (is-a grinder2 GRINDER),
(is-a wheel2 GRINDING-WHEEL), (holding-tool grinder2 wheel2),
(holding grinder2 vise2 part2), (has-fluid grinder2).]

Figure 4.2: Finding new preconditions of grinding

But in the general case, there can be several differences between the state in which
the operator is applied successfully and the state in which a failure happens. Then,
experimentation is needed to determine which one of the differences is relevant for this
particular failure. The method for learning new preconditions is summarized in Table 4.1.
Notice that Δ(S_old, S_current) contains the following two sets of predicates: (1) predicates
in S_old that are not in S_current, and (2) the negation of predicates in S_current that are not
in S_old. So this method accounts for learning of positive as well as negative preconditions,
depending on which subset of the differences contains the relevant condition.


If after manipulating the world the effects of the operator O are not true,
then hypothesize that a precondition of the operator is missing.

1. Select candidate preconditions. The candidate set Δ(S_old, S_current) is
formed by calculating all the differences between the most similar earlier
state in the previous problem solving history in which O was applied
successfully (S_old) and the current state S_current (an unsuccessful application
of O).

2. Identify missing precondition. Formulate experiments observing if the
operator is successfully applied when one of the differences P is true in
the state. Use any information available to formulate the most promising
experiments first. In absence of knowledge, apply a divide and conquer
strategy to isolate the precondition from Δ(S_old, S_current).

3. Add P as a new precondition of operator O.

Table 4.1: Method for learning new preconditions. When the effects of an operator do
not occur in the external world, a previous successful application of the operator is used
to find a missing condition of the operator.
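The candidate set Δ(S_old, S_current) can be written down directly from its definition. The sketch below assumes states are sets of ground literals encoded as tuples and marks a negated literal with a leading "~"; both conventions, and the function name `state_difference`, are illustrative choices rather than the system's own representation.

```python
def state_difference(s_old, s_current):
    """Candidate preconditions from a past success and a current failure.

    (1) literals true in s_old but not in s_current (positive candidates);
    (2) negations of literals true in s_current but not in s_old.
    """
    positives = {lit for lit in s_old if lit not in s_current}
    negatives = {("~", lit) for lit in s_current if lit not in s_old}
    return positives | negatives


# Hypothetical ground states for a successful and a failed GRIND.
s_old = {("is-a", "grinder1", "GRINDER"), ("has-fluid", "grinder1")}
s_current = {("is-a", "grinder1", "GRINDER"), ("has-hole", "part1", "side1")}
candidates = state_difference(s_old, s_current)
```

Here `candidates` holds one positive candidate (the grinder had fluid) and one negative candidate (the part had no hole), which is why the method can learn both positive and negative preconditions.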

This  set  of  hypotheses  does  not  necessarily  contain  the  relevant  condition  as  it  may 
not  be  represented  as  a  single  atomic  observable  expression.  Other  possible  hypotheses 
to  be  considered  as  candidate  conditions  are: 

•  Disjunctive  expressions  of  predicates 

•  Inferred  predicates  deduced  in  a  state  by  theorem  proving 

•  Quantified  expressions  of  some  predicates 

• Predicates that are never observed because they are not needed for planning (e.g.,
the weight of a box)

• A functional relation of several predicate arguments

So if the cause of the failure is not found after experimenting with Δ(S_old, S_current),
then these additional hypotheses must be considered. EXPO does not expand the
hypotheses further, and it confines the experiments to Δ(S_old, S_current). When it runs out
of hypotheses, it gives up learning and continues plan execution.


4.2  Constructing  the  Set  of  Hypotheses 

As  we  described  in  the  previous  section,  if  there  are  several  differences  between  the  success 
state  and  the  failure  state,  experimentation  is  needed  to  find  the  relevant  condition  for 
the  failure. 

Here is a typical set obtained by EXPO. In this case, GRIND(grinder1, wheel1, vise1,
part?, TOP) is successful but GRIND(grinder1, wheel1, vise1, part3, TOP) fails:

(size-of <part> WIDTH 3)
(size-of <part> LENGTH 7)
(size-of <part> HEIGHT 2.5)
(material-of <part> BRASS)
(has-fluid <machine>)
(surface-finish part26 <side> SAWCUT)
(holding drill1 vise2 part26 <side>)
(material-of part26 STEEL)
(is-a drill1 DRILL)
(is-a drill-bit1 DRILL-BIT)
(material-of part37 COPPER)
(has-hole part37 <side>)

The  problem  can  now  be  specified  as  follows: 

Given: an operator OP that has an incomplete set of preconditions

a set of predicates Candidates that contains a precondition that OP is missing

Find: which predicate in Candidates is the missing precondition of OP

If all the predicates in Candidates are equally likely as possible new conditions, a
divide and conquer strategy through the set Candidates is the most appropriate
experimentation strategy. The algorithm is described in Table 4.2. Notice that if the cardinality
of Candidates is n, this algorithm requires log(n) experiments. Furthermore, each
experiment has a large set of requirements. Besides Preconditions(OP), the first experiment
requires n/2 predicates to be satisfied, the second requires n/4, and so on until there
is only one predicate left (a total of 2n − 1 predicates). The planner has to build
a plan to set the environment in a state that satisfies that many predicates. Apart from
the planning effort involved, the execution of those plans raises non-trivial issues. Plan
execution may use up valuable resources (including time), produce non-desirable changes
in the environment that are hard to undo, and interfere with the main goals of the
system's task. For all these reasons, it is important to minimize the number of experiments
and their requirements.

Divide_and_Conquer_Experimentation(OP, Candidates)

1. New_Candidates ← { }

2. Divide Candidates into two subsets of equal cardinality: Candidates_A and
Candidates_B.

3. Prepare experiment: achieve a state where Preconditions(OP) ∧ Candidates_A are
satisfied.

4. Experiment: execute OP.

5. If execution is successful, then New_Candidates ← Candidates_A else
New_Candidates ← Candidates_B

6. If Cardinality(New_Candidates) = 1, then return New_Candidates else
Divide_and_Conquer_Experimentation(OP, New_Candidates).

Table 4.2: Algorithm for divide and conquer experimentation. The algorithm divides
the set of candidates into two subsets of the same size, and uses an experiment to find
out which subset contains the missing precondition, then the process is iterated on the
subset until its size is one. The algorithm always requires log(n) experiments and a total
of 2n - 1 predicates to achieve.
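The halving procedure of Table 4.2 can be sketched in Python as follows. This is only an illustration under stated assumptions, not EXPO's implementation: predicates are plain strings, exactly one candidate is the missing precondition, and `simulated_operator` is a hypothetical stand-in for preparing the state and executing OP in the environment.

```python
from typing import Callable, List, Sequence, Tuple

Predicate = str


def divide_and_conquer(
    candidates: Sequence[Predicate],
    run_experiment: Callable[[List[Predicate]], bool],
) -> Tuple[Predicate, int]:
    """Return the missing precondition and the number of experiments used."""
    remaining = list(candidates)
    experiments = 0
    while len(remaining) > 1:
        half = len(remaining) // 2
        subset_a, subset_b = remaining[:half], remaining[half:]
        experiments += 1
        # If OP succeeds with subset A satisfied, the missing condition is in A.
        remaining = subset_a if run_experiment(subset_a) else subset_b
    return remaining[0], experiments


def simulated_operator(missing: Predicate) -> Callable[[List[Predicate]], bool]:
    """Hypothetical environment: OP succeeds only if `missing` is satisfied."""
    return lambda satisfied: missing in satisfied


candidates = [f"p{i}" for i in range(8)]
found, experiments = divide_and_conquer(candidates, simulated_operator("p5"))
print(found, experiments)
```

With eight candidates the loop halves the set three times, which matches the log(n) experiment count stated in the caption.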

Another consideration is that the set of hypotheses constructed contains many
candidates that may not be worth exploring unless everything else fails. In the hair dryer
example of Section 1.1 some of the initial candidate hypotheses were the time of the day
and the day of the week. In the set of hypotheses above for the GRIND operator, bogus
hypotheses include "GRIND fails if there is a part made of steel" and "GRIND fails if
there is a part that has a hole". Additionally, if the operator is missing more than one
condition, the algorithm will fail. The divide and conquer algorithm is very simple to
implement, but is far from satisfactory. If any information is available to determine a
smaller subset of Candidates as more relevant, the experimentation effort may be greatly
reduced. In particular, if we could devise a way of ranking the predicates in Candidates
from most relevant to least, then each candidate could be tested individually.

Such an informed algorithm is shown in Table 4.3. The number of experiments required
is inversely proportional to the competence of the ranking procedure. Most importantly,
only one predicate needs to be satisfied in each experiment (apart from
Preconditions(OP)). On average, n/2 experiments are needed. In the worst case, n
experiments are needed, each involving also 1 top level goal.

Informed_Experimentation(OP, Ranked_Candidates)

1. Current_Candidate ← Pop(Ranked_Candidates)

2. Prepare experiment: achieve a state that satisfies Preconditions(OP) ∧
Current_Candidate.

3. Experiment: execute OP.

4. If execution is successful then return Current_Candidate else return
Informed_Experimentation(OP, Ranked_Candidates).

Table 4.3: Algorithm for informed experimentation. The candidates most likely to be
relevant are ranked higher. On average, n/2 experiments are needed (n in the worst case)
and each involves 1 top level goal.
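A minimal Python sketch of the loop in Table 4.3 is given below. It assumes that candidates are strings already ordered by some ranking procedure and that `run_experiment` (a hypothetical callback) achieves Preconditions(OP) plus the single candidate and reports whether OP then succeeds. The recursion of the table is written as iteration.

```python
from typing import Callable, Optional, Sequence, Tuple


def informed_experimentation(
    ranked_candidates: Sequence[str],
    run_experiment: Callable[[str], bool],
) -> Tuple[Optional[str], int]:
    """Test candidates one at a time, best ranked first."""
    experiments = 0
    for candidate in ranked_candidates:
        experiments += 1
        if run_experiment(candidate):
            return candidate, experiments
    # No candidate made OP succeed: the missing condition was not in the list.
    return None, experiments


ranked = [
    "(has-fluid <machine>)",
    "(material-of <part> BRASS)",
    "(surface-finish <part> <side> SAWCUT)",
]
missing = "(has-fluid <machine>)"  # assumed ground truth for the simulation
best_case = informed_experimentation(ranked, lambda c: c == missing)
worst_case = informed_experimentation(list(reversed(ranked)), lambda c: c == missing)
print(best_case, worst_case)
```

The two calls contrast a good ranking (one experiment) with the worst possible ranking (n experiments).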

Many systems discussed in Chapter 2 use causal theories or other types of background
knowledge  to  build  explanations  that  lead  to  the  causes  of  the  failure.  EXPO  relies 
exclusively  on  the  knowledge  given  initially  for  planning.  This  means  that  the  learning 
occurs  even  when  no  causal,  structural,  or  common  sense  knowledge  (other  than  the  one 
embedded  in  the  domain  model)  is  available.  This  is  a  major  advantage,  since  we  do  not 
need  to  address  in  turn  the  acquisition  and  refinement  of  that  additional  and  necessarily 
complex  background  knowledge. 

In  summary,  any  information  that  may  be  used  to  rank  the  hypotheses  greatly  reduces 
the  experimentation  effort.  EXPO's  approach  is  to  use  heuristics  that  extract  any  such 
information  strictly  from  the  domain  knowledge  given  to  the  planner.  The  heuristics  for 
choosing  hypotheses  presented  in  the  next  section  are  a  step  in  this  direction. 


4.3 Choosing Hypotheses: Finding Relevant Conditions for Failure

This section presents different ways to exploit knowledge about the planning task to
evaluate which predicates in a set of differences are more likely to have caused the failure.
The section begins by describing three heuristics to choose hypotheses. Then, their
implementation in EXPO follows. Section 4.6 presents a discussion of these heuristics.
Their evaluation is presented in Chapter 6 together with other empirical results for
EXPO.

4.3.1  Locality  of  Actions 

The  first  heuristic  is  the  locality  of  actions.  The  preconditions  and  effects  of  actions  are 
concentrated  locally,  usually  affecting  the  objects  under  direct  influence  of  the  action. 
In our example we are grinding part?. The fact that this part is made of BRASS may
be relevant to the failure obtained. However, it is probably not important that part37
is  made  of  COPPER.  This  means  that  we  can  select  the  predicates  in  the  set  related  to 
objects  that  the  operator  GRIND  refers  to  directly. 

This locality heuristic is implemented by considering only the predicates in the state
that contain any of the objects included in the bindings of the parameters of the operator.
In our example, if we extract the predicates that include any of {grinder1, wheel1, vise1,
part?, TOP} we obtain the following subset:

(size-of <part> WIDTH 3)
(size-of <part> LENGTH 7)
(size-of <part> HEIGHT 2.5)
(material-of <part> BRASS)
(has-fluid <machine>)
(surface-finish part26 <side> SAWCUT)
(has-hole part37 <side>)
(holding drill1 vise2 part37 <side>)

Notice  that  with  this  heuristic  we  eliminated  from  the  list  many  predicates  that  were 
in  fact  irrelevant  for  grinding.  For  example,  many  facts  about  parts  not  being  ground 
have  disappeared. 
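The locality filter can be illustrated with a short Python sketch. It assumes that facts are tuples of the form (predicate, argument, ...) and that the operator bindings are available as a set of object names; the constant `part1` and the sample facts are hypothetical and chosen only for the illustration.

```python
from typing import Iterable, List, Set, Tuple

Fact = Tuple[str, ...]


def locality_filter(facts: Iterable[Fact], bound_objects: Set[str]) -> List[Fact]:
    """Keep only the facts that mention an object bound by the operator."""
    return [
        fact for fact in facts
        if any(argument in bound_objects for argument in fact[1:])
    ]


state = [
    ("material-of", "part1", "BRASS"),
    ("has-fluid", "grinder1"),
    ("material-of", "part37", "COPPER"),
    ("is-a", "drill1", "DRILL"),
    ("has-hole", "part37", "TOP"),
]
bindings = {"grinder1", "wheel1", "vise1", "part1", "TOP"}
local_facts = locality_filter(state, bindings)
print(local_facts)
```

In this sketch the fact about the hole in part37 survives the filter only because it mentions TOP, so a filter of this kind can also retain facts that are not relevant to the operator.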

This  heuristic  is  not  helpful  when  the  set  of  variables  that  appears  in  an  operator  is 
incomplete.  If  the  operator  for  grinding  lacks  any  predicates  that  have  to  do  with  the 
tool  being  used,  the  system  would  never  learn  that  the  tool  is  important  for  the  action. 
A possible way around this problem is to give some structured knowledge to the state,
for example, information about where everything is and what things are close to each
other. In this work, we avoid this kind of approach because it requires adding to the
system knowledge that is not strictly required for planning.

Another problem is that this heuristic does not always propose relevant differences.
Consider the subset of differences just obtained. Because grinding is being done to
part?, all the facts about part? could be relevant. But since the TOP is the side being
ground, any facts that have to do with TOP are also considered relevant. This includes,
for example, the fact that part37 has a hole on the TOP, which is not relevant to the
application of the operator.

4.3.2  Generalization  of  Experience 

Another  helpful  heuristic  is  generalization  of  past  experience.  Generalizing  successful 
situations  tells  us  what  predicates  appear  in  all  success  states.  This  summary  of  past 
experience  helps  us  to  locate  relevant  causes  of  failures. 

This heuristic is implemented by generalizing successful situations through the
bindings of the operator. This gives us the set of predicates that have appeared in all of
them. After removing from that set the predicates that correspond to the preconditions
of the operator, we obtain the following set:

(material-of <part> BRASS)
(surface-finish <part> <side> SAWCUT)
(has-fluid <machine>)

Notice  that  this  set  is  much  smaller  than  the  one  in  the  previous  section,  where  we 
only  considered  a  single  success  situation.  When  the  system  encounters  more  successful 
situations,  then  the  set  of  differences  becomes  smaller. 

If the system has no previous experience with the application of the operator, this
generalization strategy is not helpful. This strategy also fails when not much
generalization can be extracted from successful applications.

A generalization of all the possible situations where grinding is successfully applied
is exactly the correct precondition expression sought. The preconditions of an operator
express the sufficient conditions for applying the operator, and represent the class of
states in which the operator is applicable. Thus, learning the precondition expression of
an operator is a problem of concept learning. The initial precondition expression of an
operator is the initial description of the concept. Each successful execution of an action is
a positive example of the concept, and each failure a negative example. Experimentation
is an additional source of examples, and it provides the learner with the ability to design
instances and direct the learning.

However, this concept learning is simpler due to common idealized simplifying
assumptions of planning tasks. There are no misclassified examples. The effects of actions
can be observed immediately after execution. The observations are collected through
noise-free sensors. Under these assumptions, our classification of execution success and
failure never produces noisy data. As far as the language used for expressing the concepts
is concerned, the large majority of the precondition expressions in operators are
conjunctions of predicates (or negations of predicates). This is because actions are easier
to express if their effects under different conditions are described in separate operators.
Disjunctions can be (and are) expressed explicitly in different operators. In this sense,
limiting learning to conjunctive expressions is still useful.

4.3.3  The  Structure  of  Domain  Knowledge 

Operators for a single task are often closely related to one another. Some operators are
inverses,  i.e.,  they  undo  each  other’s  effects.  Some  operators  have  similar  effects,  but 
are  applied  under  different  conditions.  Both  of  these  relations  appear  in  the  machining 
domain.  There  are  operators  for  holding  a  part  with  a  certain  holding  device,  and  there 
are  operators  to  release  the  part  from  the  device.  There  are  operators  for  holding  a 
tool  in  a  machine,  and  operators  for  releasing  tools  from  machines.  The  operators  for 
drilling  are  all  similar  to  one  another.  So  are  the  operators  for  polishing  surfaces.  These 
relations  of  similarity  and  reversibility  constitute  the  heuristic  of  structural  regularity  of 
the  domain. 

Structural similarity will help us identify what hypotheses are more plausible by
looking at operators similar to the operator being considered. This is a very general
idea, and it can be used for learning new preconditions, as described next.

One  way  to  implement  this  heuristic  is  to  organize  the  operators  in  a  hierarchy,  so  that 
similar  operators  can  be  easily  located.  The  hierarchy  can  be  built  through  comparing 
the  preconditions  and  effects  of  operators.  In  our  machining  domain,  part  of  the  hierarchy 
that  includes  the  grinding  operation  is  shown  in  Figure  4.3. 


[Figure: operator hierarchy. Recoverable node labels: change-surface-finish, reduce-size]

Figure 4.3: Part of the operator hierarchy in the process planning domain.

Consider the set of differences obtained in the previous section as possible candidates
for a new precondition of grinding. Many other operators change the size of a part. Many
of them require the use of cutting fluid, which is in fact the relevant condition for this
particular failure.^ Only some of them have conditions about the material of the part.
And none of them has any conditions about the surface finish of a side of the part. The
heuristic suggests that the differences should be considered in the following order:

1.  (has-fluid  <machine>) 

2.  (material-of  <part>  BRASS) 

3.  (surface-finish  <part>  <side>  SAWCUT) 

This  heuristic  is  not  very  helpful  if  there  are  no  similar  operators  or  if  there  are  similar 
operators  but  they  are  also  incomplete. 
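One simple way to obtain such an ordering is to count how many similar operators already mention each candidate predicate. The Python sketch below illustrates the idea only. The operator names, their precondition sets, and the use of bare predicate names are assumptions made for the example, and the counting rule is not claimed to be EXPO's exact procedure.

```python
from typing import Dict, List, Set


def rank_by_similarity(
    candidates: List[str],
    similar_operators: Dict[str, Set[str]],
) -> List[str]:
    """Order candidates by how many similar operators use the same predicate."""
    def support(candidate: str) -> int:
        return sum(
            1 for preconditions in similar_operators.values()
            if candidate in preconditions
        )
    # sorted() is stable, so candidates with equal support keep their order.
    return sorted(candidates, key=support, reverse=True)


# Hypothetical siblings of GRIND under the reduce-size node.
siblings = {
    "DRILL": {"has-fluid", "material-of"},
    "MILL": {"has-fluid"},
    "SAW": {"has-fluid"},
}
ranking = rank_by_similarity(
    ["material-of", "surface-finish", "has-fluid"], siblings
)
print(ranking)
```

With these assumed siblings the cutting fluid predicate is supported by three operators, the material predicate by one, and the surface finish predicate by none, which reproduces the order listed above.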


4.3.4  Implementation 

This  section  describes  in  detail  the  algorithms  that  implement  in  EXPO  the  heuristics 
just  described. 

Each  execution  of  the  operator  is  either  a  success  or  a  failure.  As  Section  3.2  described, 
the  precondition  expression  of  an  operator  can  be  seen  as  a  concept  that  represents  the 
states  in  which  the  operator  can  be  executed  successfully.  A  state  in  which  a  successful 
execution  occurs  corresponds  to  a  positive  instance  of  the  concept,  and  a  state  in  which 
a  failure  is  obtained  is  a  negative  instance.  Each  constant  in  the  instances  must  be 
parameterized according to the bindings of the operator. For example, if the variable
<part> is bound to part1 when we execute GRIND, and the state contains (material-of
part1 BRASS), we would like the concept to contain a more generalized version of this
fact, i.e., (material-of <part> BRASS). EXPO keeps information about action executions
in situations, which are composed of:


•  Operator:  the  operator  whose  action  was  executed. 

• Result: the result of the execution, i.e., success or failure.

•  Bindings:  the  list  of  bindings  for  the  operator  variables. 

•  State:  the  list  of  predicates  believed  to  be  true  immediately  prior  to  the  operator 
being  executed. 

^ Cutting fluids cool both the cutting edges of the tool and the part, aid in chip clearance,
and improve the surface finish. Notice that a great deal of background information would
be needed to explain that the presence of cutting fluid is important for grinding.

To generalize from experience, EXPO applies the algorithm presented in Table 4.4.
Given two situations, their generalization is a new situation generated as follows. First,
the corresponding bindings are generalized. Then the literals in the state are changed by
substituting the constants and variables according to the new bindings. The state of the
generalization includes only the predicates that appear in the generalized states of both
situations.

Generalization(S1, S2)

• Snew.Operator ← S1.Operator

• Generate Snew.Bindings from S1.Bindings and S2.Bindings

- the generalization of a variable and a constant is the variable itself.

- the generalization of two different constants is a variable.

- the generalization of two equal constants is the constant itself.

• Generate A by substituting the constants and variables of S1.Bindings that appear
in S1.State by the bindings in Snew.Bindings

• Generate B by substituting the constants and variables of S2.Bindings that appear
in S2.State by the bindings in Snew.Bindings

• Snew.State ← A ∩ B

• Return Snew


Table 4.4: Algorithm for generalizing two successful situations.
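The steps of Table 4.4 can be sketched in Python as follows. The representation is an assumption made for illustration: a situation is a small record, variables are strings written in angle brackets, literals are tuples, and each constant is bound to at most one variable. The object names in the example are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Tuple

Literal = Tuple[str, ...]


@dataclass(frozen=True)
class Situation:
    operator: str
    result: str                 # "success" or "failure"
    bindings: Dict[str, str]    # variable -> constant (or the variable itself)
    state: FrozenSet[Literal]


def is_variable(term: str) -> bool:
    return term.startswith("<") and term.endswith(">")


def generalize_bindings(b1: Dict[str, str], b2: Dict[str, str]) -> Dict[str, str]:
    new = {}
    for var in b1:
        v1, v2 = b1[var], b2[var]
        # Two equal constants stay as the constant; every other case is a variable.
        new[var] = v1 if (v1 == v2 and not is_variable(v1)) else var
    return new


def substitute(state, old: Dict[str, str], new: Dict[str, str]) -> FrozenSet[Literal]:
    mapping = {old[var]: new[var] for var in old}
    return frozenset(
        (lit[0],) + tuple(mapping.get(arg, arg) for arg in lit[1:]) for lit in state
    )


def generalize(s1: Situation, s2: Situation) -> Situation:
    new_bindings = generalize_bindings(s1.bindings, s2.bindings)
    a = substitute(s1.state, s1.bindings, new_bindings)
    b = substitute(s2.state, s2.bindings, new_bindings)
    return Situation(s1.operator, "success", new_bindings, a & b)


s1 = Situation(
    "GRIND", "success",
    {"<part>": "part1", "<machine>": "grinder1", "<side>": "TOP"},
    frozenset({("material-of", "part1", "BRASS"), ("has-fluid", "grinder1"),
               ("has-hole", "part37", "TOP")}),
)
s2 = Situation(
    "GRIND", "success",
    {"<part>": "part2", "<machine>": "grinder1", "<side>": "TOP"},
    frozenset({("material-of", "part2", "BRASS"), ("has-fluid", "grinder1"),
               ("is-a", "drill1", "DRILL")}),
)
general = generalize(s1, s2)
print(general.bindings, sorted(general.state))
```

Because the two example situations use different parts but the same grinder, only the part is replaced by a variable, and the facts about part37 and drill1 drop out of the intersection.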

Notice  that  this  generalization  algorithm  is  biased  to  produce  conjunctive  descriptions 
of  the  concept.  This  bias  is  appropriate  for  this  application.  The  large  majority  of  the 
precondition  expressions  in  operators  are  conjunctions  of  predicates  (or  negations  of 
predicates).  This  is  because  actions  are  easier  to  express  if  their  effects  under  different 
conditions  are  described  in  separate  operators.  In  this  sense,  even  if  the  system  aims 
to  learn  only  conjunctive  expressions  of  predicates  it  would  be  a  great  win.  In  fact, 
even  though  PRODIGY  allows  for  a  very  expressive  language  in  the  preconditions,  the 
generalization  only  contains  the  predicates  in  the  preconditions  that  are  part  of  the 
conjunct. For example, if the precondition expression of an operator is (and (A B C D
(or E F))), E and F are never included in the generalization.

EXPO maintains the current description of each operator’s preconditions as a version
space [Mitchell, 1978]. A version space is defined within a lattice of concept expressions
that are ordered from more general to more specific. Concept instances are the most
specific expressions. Successively more general descriptions are found at higher positions
in the lattice. A version space is defined by two boundary sets: a set of maximally
specific descriptions (S) and a set of minimally specific descriptions (G). A version space is
maintained for each operator. The examples correspond to situations in which the system
tried to apply the operator. Recall that successful situations are positive instances, and
failure situations are counterexamples or negative instances. Given two situations S1 and
S2, S1 is more general than S2 if both of the following hold:

• Each literal in the state of S1 has a corresponding literal in the state of S2. The
correspondence is done through the bindings of both situations.

• The bindings of S1 are more general. A variable is more general than a constant.
If two constants are equal, the generalization is the constant itself.

The G set, the most general description, is initialized to the initial preconditions of
the operator and is kept equal to the current preconditions. The S set is updated
as new success situations are obtained using the generalization algorithm just described.
When a new failure situation is obtained, the S set is updated by removing from it
any conjuncts that also appear in the failure state. Because G is always the current
preconditions of the operator, G always covers failure situations and must be specialized.
Instead of following the usual procedure for updating G (which is highly inefficient when
there are many possible new conditions), EXPO waits until the missing condition is found
through experimentation, and then adds it to the current conjunct in the G set.

The version spaces implement the heuristic for selecting hypotheses based on
generalization of experience. From the set of current candidate hypotheses, only the
ones that appear in S (the ones that are common to all successful situations) and do not
appear in G (since G contains the preconditions, they appear in the failure state) are
selected.
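The bookkeeping described above can be sketched with ordinary set operations. The Python fragment below is a simplification under stated assumptions: S and G are each a single conjunction represented as a set of already parameterized predicate strings, and the example predicates are illustrative.

```python
from typing import FrozenSet, List


def update_s_on_success(s_set: FrozenSet[str], success_state: FrozenSet[str]) -> FrozenSet[str]:
    """Generalize S: keep only what is also true in the new success state."""
    return s_set & success_state


def update_s_on_failure(s_set: FrozenSet[str], failure_state: FrozenSet[str]) -> FrozenSet[str]:
    """Drop from S every conjunct that also held when the operator failed."""
    return s_set - failure_state


def select_by_generalization(
    candidates: List[str], s_set: FrozenSet[str], g_set: FrozenSet[str]
) -> List[str]:
    """Keep candidates that are in S but are not already known preconditions (G)."""
    return [c for c in candidates if c in s_set and c not in g_set]


g = frozenset({"(is-a <machine> GRINDER)"})
s = frozenset({
    "(is-a <machine> GRINDER)",
    "(has-fluid <machine>)",
    "(material-of <part> BRASS)",
    "(size-of <part> WIDTH 3)",
})
failure = frozenset({"(is-a <machine> GRINDER)", "(size-of <part> WIDTH 3)"})
s = update_s_on_failure(s, failure)
selected = select_by_generalization(
    ["(has-fluid <machine>)", "(material-of <part> BRASS)",
     "(size-of <part> WIDTH 3)", "(has-hole part37 <side>)"],
    s, g,
)
print(selected)
```

In this example the width predicate is discarded because it also held in the failure state, and the hole predicate is discarded because it never appeared in S.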

The set of hypotheses selected by the generalization heuristic is then filtered by the
locality heuristic. This heuristic selects only the hypotheses that contain constants and
variables that appear in the bindings of the failure situation. This new subset of the
hypotheses is then ranked by the heuristic of structural similarity as we explain now.

All the domain operators are organized by EXPO in a hierarchy using a simple
clustering algorithm described in Table 4.5. The top node contains all the operators in the
hierarchy. For every node, the operators that are not in any of its children yet are
examined to build a child node. The expression or expressions^ that appear in the largest
number of operators define the child node, and the operators that contain them are
transferred to it. The algorithm works its way down in the tree until a node is reached
that contains only one operator or all of its operators’ expressions are included in the
node. When a new condition or effect for an operator is learned, the hierarchy is updated
by recomputing the children of the node that contains the operator.

^ Preconditions, postconditions, or both. In our experience with EXPO’s domains, this
does not make a difference in the effectiveness of the structural similarity heuristic.

Build_Operator_Hierarchy(Operators)

1. For each OP ∈ Operators do

Expr(OP) ← expressions in the preconditions and effects of OP.

2. Open ← {}.

3. Node.Ops ← Operators

4. Open ← Node

5. Repeat

• Node ← Pop(Open)

• Node.Subtypes ← Produce_Subtypes(Node)

• Node.Ops ← Node.Subtypes

• Push(Open, Node.Subtypes)

Until Null(Open)

Produce_Subtypes(Node)

1. Repeat

(a) Find the set of expressions E that are most common for operators in
Node.Ops.

(b) Make a subtype node with all the operators in Node.Ops that have all the
expressions in E and remove them from Node.Ops.

Until Ops_In_Subtypes = Node.Ops.

Table 4.5: Algorithm for building an operator hierarchy.
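The clustering of Table 4.5 can be approximated by the Python sketch below. It is a simplified reading and not EXPO's code: each child node is defined by the single most common expression rather than a set E, expressions are plain strings, and the operators and their expressions are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, List


@dataclass
class Node:
    ops: List[str]
    defining: FrozenSet[str]          # expressions shared by every operator here
    children: List["Node"] = field(default_factory=list)


def produce_subtypes(node: Node, expr: Dict[str, FrozenSet[str]]) -> List[Node]:
    unassigned = list(node.ops)
    children = []
    while unassigned:
        counts = Counter(
            e for op in unassigned for e in sorted(expr[op] - node.defining)
        )
        if not counts:
            break  # nothing left to distinguish the remaining operators
        best, _ = counts.most_common(1)[0]
        members = [op for op in unassigned if best in expr[op]]
        children.append(Node(members, node.defining | {best}))
        unassigned = [op for op in unassigned if op not in members]
    return children


def build_operator_hierarchy(expr: Dict[str, FrozenSet[str]]) -> Node:
    root = Node(sorted(expr), frozenset())
    open_nodes = [root]
    while open_nodes:
        node = open_nodes.pop()
        if len(node.ops) > 1:
            node.children = produce_subtypes(node, expr)
            open_nodes.extend(node.children)
    return root


expressions = {
    "GRIND": frozenset({"reduce-size", "has-fluid"}),
    "MILL": frozenset({"reduce-size", "has-fluid"}),
    "DRILL": frozenset({"has-hole", "has-fluid"}),
    "POLISH": frozenset({"change-surface-finish"}),
}
root = build_operator_hierarchy(expressions)
print([child.ops for child in root.children])
```

Each child adds one expression to its defining set, so the recursion stops once a node holds a single operator or its operators have no further expressions to split on.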

EXPO considers first the hypotheses that are selected by the three heuristics. Then,
it considers the ones that the structural regularity rejected, then the ones rejected by the
locality heuristic. Last, EXPO considers the rest of the hypotheses in the initial set.
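This ordering can be written as a concatenation of four tiers. The Python sketch below assumes that each heuristic's output is available as a list that is a subset of the previous stage's output; the hypothesis names are placeholders.

```python
from typing import List


def order_hypotheses(
    initial: List[str],
    generalized: List[str],   # selected by generalization of experience
    local: List[str],         # of those, the ones kept by locality
    structural: List[str],    # of those, the ones kept (and ranked) by structure
) -> List[str]:
    tiers = [
        list(structural),
        [h for h in local if h not in structural],
        [h for h in generalized if h not in local],
        [h for h in initial if h not in generalized],
    ]
    return [h for tier in tiers for h in tier]


ordered = order_hypotheses(
    initial=["h1", "h2", "h3", "h4", "h5"],
    generalized=["h1", "h2", "h3", "h4"],
    local=["h1", "h2", "h3"],
    structural=["h2", "h1"],
)
print(ordered)
```

No hypothesis is discarded by this scheme. The heuristics only decide how early each one is tested.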

Determining the missing precondition is done through iterative experimentation with
the ranked list of candidate predicates. In EXPO, this process converges if the missing
condition is an observable and non-inferred predicate that is within a conjunctive
expression. If this is the case, the missing condition is included in the group of candidate
hypotheses, and EXPO eventually encounters it and learns it through experimentation.

Although the algorithms presented in this section can be made more sophisticated,
we must keep in mind that they are used to build heuristics. Despite their simplicity, the
results in Chapter 6 show that they are effective for implementing these heuristics.


4.4  The  Experimentation  Search  Space 

The  previous  section  described  how  to  compare  hypotheses  heuristically  to  evaluate  which 
ones  are  more  promising.  Once  a  particular  hypothesis  is  chosen,  an  experiment  must 
be  designed  to  test  it.  In  our  particular  example,  the  heuristics  suggest  that  the  most 
promising  hypothesis  is  that  the  precondition  that  the  operator  GRIND  is  missing  is 
(has-fluid  <machine>). 

In  order  to  perform  an  experiment,  the  world  must  be  brought  to  a  state  where  the 
conditions  of  the  experiment  are  satisfied.  In  our  example,  we  must  reach  a  state  where 
the  current  known  preconditions  of  GRIND  and  the  hypothesized  new  condition  are 
satisfied.  In  other  words,  our  goal  is  to  reach  a  state  where  the  following  are  true: 

(exists (<machine> <tool> <part> <dim> <side> <holding-device>)
  (and
    (is-a <machine> GRINDER)
    (is-a <tool> GRINDING-WHEEL)
    (is-a <part> PART)
    (holding-tool <machine> <tool>)
    (side-up-for-machining <dim> <side>)
    (holding <machine> <holding-device> <part> <side>)
    (has-fluid <machine>)))
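An experiment goal of this form is the conjunction of the operator's known preconditions and the hypothesized condition. The Python sketch below builds such an expression as text. The function name and the string representation are assumptions made for illustration.

```python
from typing import List


def experiment_goal(
    variables: List[str], preconditions: List[str], hypothesis: str
) -> str:
    """Conjoin the known preconditions with the hypothesized new condition."""
    conjuncts = list(preconditions)
    if hypothesis not in conjuncts:
        conjuncts.append(hypothesis)
    body = "\n    ".join(conjuncts)
    return f"(exists ({' '.join(variables)})\n  (and\n    {body}))"


goal = experiment_goal(
    ["<machine>", "<part>"],
    ["(is-a <machine> GRINDER)", "(is-a <part> PART)"],
    "(has-fluid <machine>)",
)
print(goal)
```

The resulting expression is what the pre-experiment plan must achieve before the operator is executed.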

The planner must first come up with a plan to achieve this state from its current
state, which is the state in which the failure occurred that triggered experimentation. We
call this search process pre-experiment planning.

Once the pre-experiment plan is executed, the experiment can be carried out. In our
example, we GRIND and check if this time the effects specified for GRIND are obtained.
If not, other hypotheses must be tested with other experiments. But if grinding works
now, then the missing condition must be (has-fluid <machine>). The new condition is
added to the operator GRIND. Then, the original plan that failed must be continued in
order to achieve the original goal. If the pre-experiment plan has undone any of the facts
necessary for the original plan, then a post-experiment plan is needed to restore those
facts and continue with the main plan. Whether a post-experiment plan is used to enable
the continuation of the original plan or replanning is done to achieve the original goals is
not the issue here. The issue is that there is some effort needed to restore facts that were
undone during pre-experiment planning and we call that post-experiment planning.

Clearly, some pre-experiment plans are better than others. Minimal interference with
the main plan is important. In our example, it would be better to use another holding
device for the experiments since vise1 is already holding part1. So maybe using grinder2,
wheel2, and vise2 is better. But perhaps it is more important to make the pre-experiment
plan as short as possible, so we can recover from the failure and go on with our main
plan. If this is the case, maybe using grinder1, wheel1, and vise1 is better since they are
already set up and ready for the grinding operation. So, one experiment may be better
than another one, depending on what policy is preferred.

EXPO  designs  experiments  following  a  set  of  policies  chosen  by  the  user  from  a  pool. 
Each  policy  defines  a  preference  to  be  used  for  decision  making  and  can  be  thought  of 
as  a  piece  of  control  knowledge  to  be  used  during  experimentation  planning.  Policies  are 
grouped  together  to  define  strategies.  We  describe  now  EXPO’s  policies  and  strategies 
in  detail. 

4.4.1  Experiment  Policies 

The experiment policies described in this section are grouped under four topics: search
depth and plan length, goal interactions, operator properties, and binding interactions.
They are summarized in Figure 4.4. Notice that all the policies described in this section
are domain independent.

Search  Depth  and  Plan  Length 

Limiting  the  search  depth  helps  control  the  search  time.  Limiting  the  plan  length  helps 
control  the  execution  time. 

Each  level  of  a  search  involves  the  application  of  an  operator  or  an  inference  rule. 
An  inference  rule  represents  a  deduction  from  the  current  state,  whereas  an  operator 
represents  an  externally  executable  action.  The  final  plan  is  composed  only  of  actions. 
This is why the depth of the search does not correspond to the length of the plan,
although they are related.

• Search depth and plan length
- Avoid deep nodes
- Prefer shallow nodes
- Avoid long plans
- Prefer short plans
- Avoid plans with too many state changes
- Prefer plans with fewer state changes

• Goal interactions
- Support main goal concord
- Avoid main goal protection violation
- Avoid main prerequisite violation

• Operator properties
- Avoid irreversible operators
- Prefer reversible operators
- Prefer operators that minimize state changes
- Prefer more reliable operators
- Avoid unreliable operators

• Binding interactions
- Avoid objects of very high protection
- Prefer objects of lower degree of protection
- Prefer least number of protected objects

Figure 4.4: EXPO’s experimentation policies.

EXPO’s available policies that concern experimentation search depth and plan length
are:

• Avoid deep nodes: Never expand nodes below a certain depth. This maximum
depth for the experimentation search must be given a value.

• Prefer shallow nodes: Prefer expanding shallower nodes.

• Avoid long plans: Never choose plans that are longer than a given length.

• Prefer short plans: Prefer plans that are shorter.

• Avoid plans with too many state changes: Never choose plans that cause
changes in the external world over a given number. The amount of changes that a
plan produces is the sum of the effects of the operators that compose it.

• Prefer plans with fewer state changes: Prefer plans that cause a smaller
amount of changes in the external world.
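The difference between the "avoid" and "prefer" policies can be illustrated with a small Python sketch in which "avoid" policies act as hard limits and "prefer" policies order the plans that remain. The `Plan` record, the numeric limits, and the choice to compare plan length before state changes are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Plan:
    name: str
    steps: Tuple[str, ...]
    state_changes: int   # sum of the effects of the plan's operators


def choose_plan(plans: List[Plan], max_length: int, max_changes: int) -> Optional[Plan]:
    # "Avoid" policies: discard plans that exceed a limit.
    allowed = [
        p for p in plans
        if len(p.steps) <= max_length and p.state_changes <= max_changes
    ]
    if not allowed:
        return None
    # "Prefer" policies: shorter plans first, then fewer state changes.
    return min(allowed, key=lambda p: (len(p.steps), p.state_changes))


plans = [
    Plan("A", ("s1", "s2"), 9),
    Plan("B", ("s1", "s2", "s3"), 4),
    Plan("C", ("s1", "s2", "s3", "s4", "s5", "s6"), 1),
    Plan("D", ("s1", "s2", "s3"), 2),
]
chosen = choose_plan(plans, max_length=4, max_changes=5)
print(chosen.name if chosen else None)
```

Plan A is the shortest but is excluded for causing too many changes, and plan C causes the fewest changes but is too long, so the preference ordering is applied only to B and D.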

Goal  Interactions 

The goal interaction policies refer to the interactions between the goals in the
experimentation space and goals in the main search space. They are different from the
types of interactions within a search space, as in [Sussman, 1975; Sacerdoti, 1977], where
for example goal G1 may be preferred to another goal if achieving G1 first causes G2
to undo G1. Here, a search path is preferred over another one if it minimizes negative
interference (or maximizes positive interference) with the top level goals. Notice that the
preference is over which search paths to pursue, not over which goals.

EXPO’s policies for interactions with the main goals are:

•  Support  main  goal  concord:  If  a  search  path  achieves  a  goal  that  remains  to 
be  achieved  by  the  main  plan,  prefer  it  over  other  paths. 

• Avoid main goal protection violation: If a search path clobbers a goal
previously achieved by the main plan that is still needed to achieve the main goals, then
prefer other search paths over this one.

•  Avoid  main  prerequisite  violation:  If  a  search  path  undoes  a  fact  that  the 
remaining  main  plan  requires  to  be  true,  then  prefer  other  search  paths  to  this  one. 
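One way to picture these three policies is as a score attached to each search path: positive for concord with pending main goals, negative for clobbered goals and undone prerequisites. The Python sketch below uses equal weights, which is an assumption of this example and not something the text specifies. The fact names are illustrative.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class MainPlanContext:
    pending_goals: FrozenSet[str]     # main goals not yet achieved
    protected_goals: FrozenSet[str]   # main goals already achieved and still needed
    prerequisites: FrozenSet[str]     # facts the remaining main plan requires


def interference_score(
    adds: FrozenSet[str], deletes: FrozenSet[str], ctx: MainPlanContext
) -> int:
    """Higher is better: concord counts for a path, violations count against it."""
    concord = len(adds & ctx.pending_goals)
    protection_violations = len(deletes & ctx.protected_goals)
    prerequisite_violations = len(deletes & ctx.prerequisites)
    return concord - protection_violations - prerequisite_violations


ctx = MainPlanContext(
    pending_goals=frozenset({"(has-hole part1 TOP)"}),
    protected_goals=frozenset({"(surface-finish part1 TOP POLISHED)"}),
    prerequisites=frozenset({"(holding grinder1 vise1 part1 TOP)"}),
)
path_reusing_vise = interference_score(
    adds=frozenset({"(has-fluid grinder1)"}),
    deletes=frozenset({"(holding grinder1 vise1 part1 TOP)"}),
    ctx=ctx,
)
path_other_vise = interference_score(
    adds=frozenset({"(has-fluid grinder2)"}),
    deletes=frozenset(),
    ctx=ctx,
)
print(path_reusing_vise, path_other_vise)
```

Under this scoring the path that leaves the main plan's holding fact intact is preferred over the one that undoes it.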

Operator  Properties 

Local  decisions  about  which  operator  to  prefer  in  order  to  achieve  a  goal  may  be  based 
on  properties  of  the  candidate  operators.  Some  properties  may  be  domain  dependent, 
such  as  the  execution  time  of  the  operator  or  other  resources  involved  (see  Section  4.4.2 
for  more  details  on  domain-dependent  policies).  These  are  EXPO’s  policies  based  on 
domain-independent  properties  of  operators: 

•  Avoid  irreversible  operators:  Never  use  irreversible  operators.  Determining 
that  an  operator  is  irreversible  requires  proving  that  there  is  no  plan  that  can  undo 
its effects. This is undecidable in general, since planning is undecidable [Chapman,
1987]. Also, the irreversibility of operators is not a binary feature: the same operator
may be irreversible in some states and reversible in others. Because of these and
other issues that make the automatic determination of irreversibility very complex,
EXPO relies on a user-defined classification of operators' reversibility.

• Prefer easily reversible operators: If the effects of operator O1 are easier to
undo than the effects of operator O2, prefer O1 over O2. Determining the degree of
reversibility  of  an  operator  is  not  a  simple  matter,  so  EXPO  relies  on  an  ordered 
list  of  operators  defined  by  the  user. 

• Prefer operators that minimize state changes: If an operator O1 has fewer
effects than operator O2, prefer O1 over O2. This policy is a more local version of
the  policy  to  prefer  plans  with  fewer  state  changes. 

• Prefer more reliable operators: If an operator O1 has a higher ratio of successes
to number of times that it has been used than operator O2, then prefer O1 over O2.
This policy avoids execution failures during the experiments.

• Avoid unreliable operators: If an operator's ratio of failures to number of times
that it has been used is over a user-defined threshold, do not use it.
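
The two reliability policies can be illustrated with a minimal sketch. It is not EXPO's implementation (EXPO encodes policies as PRODIGY control rules); the record type, the function names, the operator names, and the choice to treat untried operators as fully reliable are all assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass
class OperatorRecord:
    name: str        # hypothetical operator name
    successes: int   # successful executions observed so far
    uses: int        # total executions observed so far

    def success_ratio(self) -> float:
        # Assumption: an operator that was never used has no evidence against it.
        return self.successes / self.uses if self.uses else 1.0

def usable(ops, max_failure_ratio):
    """Avoid unreliable operators: drop those whose failure ratio is over the threshold."""
    return [op for op in ops if 1.0 - op.success_ratio() <= max_failure_ratio]

def rank_by_reliability(ops):
    """Prefer more reliable operators: highest success ratio first."""
    return sorted(ops, key=lambda op: op.success_ratio(), reverse=True)

ops = [
    OperatorRecord("drill", 9, 10),
    OperatorRecord("grind", 2, 10),
    OperatorRecord("polish", 6, 10),
]
candidates = rank_by_reliability(usable(ops, max_failure_ratio=0.5))
names = [op.name for op in candidates]
print(names)
```

Here the threshold plays the role of the user-defined limit: the operator with an 80% failure ratio is removed outright, and the remaining ones are only ordered, not excluded.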

Binding  Interactions 

During  planning,  the  variables  of  each  operator  are  given  values  by  binding  them  to 
objects  in  the  current  state.  Some  bindings  may  be  preferred  to  others.  For  example,  we 
may  prefer  to  use  in  the  experiments  a  different  machine  than  the  one  that  is  being  used 
in  the  main  plan,  since  the  machine  used  in  the  main  plan  is  probably  all  set  up  for  the 
operation.  Other  objects  may  not  bring  up  such  preferences.  For  example,  if  a  brush  is 
being  used  in  the  main  plan  to  clean  the  metal  burrs  in  the  part  we  may  not  mind  using 
it  if  needed  during  the  experiment  planning.  In  summary,  there  may  be  different  binding 
preferences  for  different  types  of  objects. 

One interesting case in the process planning domain is the type part. Suppose that
the main goal is to drill a hole of a certain width and depth in part1. Now suppose
that the drilling operation fails because of a missing precondition, and experiments with
the drilling operator are needed. If the experiments are done drilling part1, we may not
interfere with the main goal, but we would violate an implied goal: "Do not drill holes
in the part other than the ones specified in the goal". In fact, when we specify a goal
to the planner in this domain (and many others), many such implicit goals are also desired
but too complex to specify. A planner works by default on building a plan to achieve
each of its given goals, so by default it would not interfere with the implied goals. But
since the experimentation process requires producing plans for other goals, such implicit
goals may be violated by default. Notice that since the implicit goals are not declared in
the goal set of the main problem, they are not protected by the goal interaction policies.
We have addressed this problem through binding preferences as follows.

When a domain is defined, each type of object is assigned to one of the following
classes:

•  Very  high  protection:  The  instances  of  these  types  that  are  being  used  in  the  main 
plan  are  never  to  be  used  for  the  experiments. 

•  High  protection:  During  experiment  planning,  other  instances  are  preferred  to  in¬ 
stances  of  these  types  that  are  being  used  for  the  main  plan. 

• Low protection: During experiment planning, other instances are preferred to
instances  of  these  types  that  are  being  used  for  the  main  plan,  but  never  prefer 
instances  of  high  or  very  high  protection. 

•  Very  low  protection:  The  instances  of  these  types  can  be  used  any  time  during 
experiment  planning. 

In  the  robot  planning  domain  there  are  only  four  types  of  objects,  classified  as  follows: 

•  High  protection:  boxes 

•  Low  protection:  doors,  keys 

•  Very  low  protection:  rooms 

The  process  planning  domain  is  more  complex,  and  has  33  types  of  objects,  classified 
as  follows: 

•  Very  high  protection:  parts 

•  High  protection:  holding  devices 

•  Low  protection:  machines,  machine  tools,  objects  consumed  during  an  operation. 

•  Very  low  protection:  objects  not  consumed  during  an  operation. 

If  necessary,  the  number  of  degrees  of  protection  may  be  augmented,  but  the  mecha¬ 
nism  would  be  the  same. 

Once the protection classes have been defined, they are used to determine the policies
that EXPO can use for choosing bindings. They are the following:

• Avoid objects of very high protection: Never use objects that are used in the
main plan and whose type is classified as very high protection.

•  Prefer  objects  of  lower  degree  of  protection:  If  two  objects  used  in  the  main 
plan  are  being  considered  for  binding  the  same  variable,  prefer  the  object  with  a 
lower  degree  of  protection. 

•  Prefer  least  number  of  protected  objects:  If  several  objects  used  in  the  main 
plan  are  being  considered  for  binding  different  variables,  prefer  the  set  of  objects 
that  minimizes  the  total  degree  of  protection. 
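
The three binding policies can be sketched as follows. This is only an illustration under stated assumptions: the numeric weights attached to the protection classes, the exhaustive enumeration of bindings, and all object and variable names are hypothetical, and "total degree of protection" is read here as the sum of the weights of the main-plan objects that a binding reuses.

```python
from enum import IntEnum
from itertools import product

class Protection(IntEnum):
    VERY_LOW = 0
    LOW = 1
    HIGH = 2
    VERY_HIGH = 3

# Part of the process planning classification given in the text.
TYPE_PROTECTION = {
    "part": Protection.VERY_HIGH,
    "holding-device": Protection.HIGH,
    "machine": Protection.LOW,
}

def choose_bindings(candidates, types, main_plan_objects):
    """candidates maps each operator variable to the objects it may be bound to."""
    variables = list(candidates)
    best, best_cost = None, None
    for combo in product(*(candidates[v] for v in variables)):
        in_use = [o for o in combo if o in main_plan_objects]
        # Avoid objects of very high protection that the main plan is using.
        if any(TYPE_PROTECTION[types[o]] == Protection.VERY_HIGH for o in in_use):
            continue
        # Prefer the set of objects with the lowest total degree of protection.
        cost = sum(int(TYPE_PROTECTION[types[o]]) for o in in_use)
        if best_cost is None or cost < best_cost:
            best, best_cost = dict(zip(variables, combo)), cost
    return best

types = {"part1": "part", "part2": "part", "vise1": "holding-device",
         "vise2": "holding-device", "drill1": "machine"}
main_plan_objects = {"part1", "vise1", "drill1"}
candidates = {"<part>": ["part1", "part2"],
              "<vise>": ["vise1", "vise2"],
              "<machine>": ["drill1"]}
bindings = choose_bindings(candidates, types, main_plan_objects)
print(bindings)
```

In this example part1 is never chosen, the free vise is preferred over the one in use, and the only available machine is accepted even though the main plan uses it, because its type has low protection.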

In  some  domains,  it  may  be  desirable  to  have  a  policy  to  prefer  bindings  that  were  used 
previously  in  successful  executions  of  the  operator.  For  example,  in  the  process  planning 
domain it is preferable to use a tool that has worked previously with any materials than
to use a tool that has not worked in the past for certain types of materials. This policy is
not implemented in EXPO, since it is a policy that applies to the main planning process
as well, as we explain next.

4.4.2  Universal  Policies 

All the policies that the user may define for the main planning task are also applicable to
experiment planning. These policies correspond to the control knowledge (be it domain
independent or not) given to the planner to be used for decision making in the domain.
They can be considered universal policies, since they apply in both the main and the
experiment search spaces. For example, we would consider an experiment that uses
cheaper materials than another one to be better. But the same principle applies to any
two plans. The quality of the experiment plans is determined in many dimensions by
these policies, which are to be addressed by other, more specific work on plan quality.

Experiment  policies  and  universal  policies  may  be  in  conflict.  When  this  is  the  case, 
EXPO  gives  priority  to  universal  policies. 


4.4.3  Experimentation  Strategies 

The  experiment  policies  described  in  the  previous  section  express  different  concerns  that 
an  experimenter  may  consider  to  design  and  choose  experiments.  Some  of  these  policies 
may  be  conflicting,  but  the  experimenter  must  have  some  overall,  global  strategy  that 
determines  which  policies  serve  the  strategy  best. 

In EXPO, many different strategies may be designed. In this section, we describe two
strategies that illustrate the capabilities of EXPO in this respect. The two strategies lie
at opposite ends of the spectrum:

• The learner-at-heart strategy. The main concern in this strategy is to acquire
new knowledge, and as such novel situations are preferred over ones already
experienced, and short experiment plans are preferred over longer ones that may delay
learning.

•  The  problem-solver-at-heart  strategy.  The  main  concern  of  this  strategy  is 
to  acquire  new  knowledge  in  order  to  solve  the  problem  at  hand.  Consequently, 
interactions  with  the  main  plan  are  avoided  when  possible,  and  repeating  proven 
solutions  is  preferred  over  trying  new  ones. 

The  learner-at-heart  strategy  is  implemented  using  the  following  policies: 

•  Avoid  deep  nodes 

•  Prefer  shallow  nodes 

•  Avoid  long  plans 

•  Prefer  short  plans 

•  Prefer  unreliable  operators 

The  problem-solver-at-heart  strategy  is  implemented  using  the  following  policies: 

•  Support  main  goal  concord 

•  Avoid  main  goal  protection  violation 

•  Avoid  main  prerequisite  violation 

• Avoid irreversible operators

• Prefer easily reversible operators

•  Prefer  more  reliable  operators 

•  Avoid  unreliable  operators 

•  Prefer  plans  with  fewer  state  changes 

•  Avoid  plans  with  too  many  state  changes 

•  Prefer  operators  that  minimize  state  changes 

•  Avoid  objects  of  very  high  protection 

•  Prefer  objects  of  lower  degree  of  protection 

• Prefer least number of protected objects

4.4.4 Implementation

Each policy is implemented in EXPO as a control rule for PRODIGY. We now briefly
summarize their syntax and semantics; more details can be found in [Minton et al., 1989b].

PRODIGY considers four choice points during the search process: which node to
expand, which goal to achieve, which operator to use to achieve a goal, and which bindings
to use to instantiate the variables of an operator. For each type of decision, PRODIGY
makes a choice using a set of heuristic rules that prefer one candidate over another,
select a candidate to the exclusion of all others, or reject a candidate so that it is
never considered again for this decision point. The left-hand side of each control rule
expresses the criteria upon which the recommendation is based. These criteria are
described in terms of the planner's meta state (the current goal, the current state, etc.) and
expressed as a special type of predicate called a meta predicate.

Appendix C contains all the policies that are defined in EXPO as control rules for
PRODIGY. This way of implementing the policies is very flexible. Any new policies can be
easily added as new control rules. Any new strategies can be easily defined by choosing a
set of control rules. At the same time, the current implementation of policies as control
rules can be greatly improved. The control rules in PRODIGY 2.0 have limited capabilities.
For example, useful policies that would suspend a search path until a later point
cannot be expressed. Also, there is no framework in PRODIGY at present
to shift attention to different goals (in our case hypotheses) changing the definition of
the problem, although some efforts within the project were in this direction [Kuokka,
1990]. EXPO can benefit greatly from ongoing research on control mechanisms for
PRODIGY.
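
To make the three kinds of recommendation concrete, the following sketch resolves one decision point with select, reject, and preference rules. It is not PRODIGY's control rule syntax or its actual resolution procedure; the rule representation, the order in which the rule kinds are applied, the meta-state dictionary, and the operator names are assumptions made for the example.

```python
from typing import Callable, List, NamedTuple

class ControlRule(NamedTuple):
    kind: str       # "select", "reject", or "prefer"
    test: Callable  # select/reject: test(c, meta); prefer: test(a, b, meta) is True when a beats b

def decide(candidates: List[str], rules: List[ControlRule], meta: dict) -> List[str]:
    selects = [r for r in rules if r.kind == "select"]
    rejects = [r for r in rules if r.kind == "reject"]
    prefers = [r for r in rules if r.kind == "prefer"]
    # Select rules narrow the pool; with no selection, every candidate stays in.
    pool = [c for c in candidates if any(r.test(c, meta) for r in selects)] or list(candidates)
    # Reject rules remove candidates for this decision point.
    pool = [c for c in pool if not any(r.test(c, meta) for r in rejects)]
    # Preference rules keep the candidates that no other candidate is preferred over.
    best = [c for c in pool
            if not any(r.test(other, c, meta) for r in prefers for other in pool if other != c)]
    return best or pool

# Hypothetical meta state for an operator decision.
meta = {"irreversible": {"weld"}, "num_effects": {"weld": 2, "bolt": 3, "glue": 1}}
rules = [
    # "Avoid irreversible operators" as a reject rule.
    ControlRule("reject", lambda c, m: c in m["irreversible"]),
    # "Prefer operators that minimize state changes" as a preference rule.
    ControlRule("prefer", lambda a, b, m: m["num_effects"][a] < m["num_effects"][b]),
]
chosen = decide(["weld", "bolt", "glue"], rules, meta)
print(chosen)
```

Under this reading, a strategy is simply the list of rules passed to the decision procedure, which is why adding a policy or defining a new strategy requires no change to the planner itself.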

4.5  Experiment  Execution,  Learning,  and  Recovery 

After calibrating and prioritizing the set of hypotheses with its heuristics, EXPO tests
one hypothesis after another until it finds the one which is the missing condition of the
operator. For each hypothesis, EXPO designs a pre-experiment plan as the previous
section described. Then, the plan is executed to reach a state where the experiment can
be  carried  out.  If  any  other  failures  are  obtained  during  the  execution,  EXPO  stores 
them  and  comes  back  to  learn  from  them  after  the  cause  of  the  current  failure  under 
study  is  determined. 

If the missing precondition is found, EXPO adds it immediately to the operator's
precondition expression. The new operator is used in any future planning. If none
of the hypotheses is found to be the missing condition, EXPO notifies the user that it
believes the operator's preconditions to be incomplete but that it cannot find the missing
condition. Section 5.2 describes the types of missing conditions that cannot be learned
by EXPO.

Since  the  missing  precondition  was  the  cause  of  the  failure  obtained  in  the  main  plan, 
its  acquisition  allows  the  planner  to  overcome  that  failure.  Now  the  execution  of  the 
main plan may be continued. However, the execution of the experiments brought about many
changes in the external state since the time when the main plan was designed, and it may
now be invalid. EXPO replans to achieve the top level goals from the current state of
the world. Then, EXPO continues with the execution of this new plan and continues
to watch for failures that signal faults in the domain knowledge that it can correct by
experimentation. 
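
The hypothesis-testing loop of this section can be sketched as follows. This is a simplified illustration, not EXPO's code: the operator is a plain dictionary, `run_experiment` is a hypothetical callback that stands for designing the pre-experiment plan, executing it, and running the experiment, and the predicates used are stand-ins.

```python
def find_missing_precondition(operator, ranked_hypotheses, run_experiment):
    """Test hypotheses in order; return the confirmed one (or None) and deferred failures."""
    deferred_failures = []
    for hypothesis in ranked_hypotheses:
        confirmed, other_failures = run_experiment(operator, hypothesis)
        # Failures met along the way are stored and studied later.
        deferred_failures.extend(other_failures)
        if confirmed:
            # The new precondition is added immediately to the operator.
            operator["preconditions"].append(hypothesis)
            return hypothesis, deferred_failures
    # No hypothesis confirmed: the caller should notify the user.
    return None, deferred_failures

grind = {"name": "GRIND", "preconditions": ["holding grinder vise part"]}

def fake_experiment(operator, hypothesis):
    # Stand-in for the real world: only this condition turns out to matter.
    return hypothesis == "has-fluid grinder", []

learned, deferred = find_missing_precondition(
    grind, ["is-clean part", "has-fluid grinder"], fake_experiment)
print(learned, grind["preconditions"])
```

After the call returns, the planner would replan for the top level goals from the current state, since the experiments may have invalidated the original main plan.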


4.6  Discussion 

TEIREISIAS [Davis, 1976] is a knowledge acquisition system with a similar technique to
EXPO's structural similarity heuristic. TEIREISIAS used a simple clustering algorithm
to discover similarity between rules. When the user entered a new rule that was clustered
together with other rules, TEIREISIAS checked that the new rule had the same predicates
in the left-hand side. If any was missing, TEIREISIAS would warn the user that it
believed that predicate should be mentioned in the rule. EXPO uses this structural
knowledge  to  refine  rules  not  when  they  are  defined,  but  when  they  are  found  to  be 
faulty.  Also,  EXPO  uses  the  heuristic  to  discriminate  among  a  set  of  hypotheses,  which 
TEIREISIAS  never  produced. 

As described in Chapter 2, the COAST system [Rajamoney, 1988] has several criteria
for choosing experiments: preferring experiments whose observations can be collected
more easily, preferring experiments that are guaranteed to disprove some hypothesis,
and changing the current state to enable experiments with different observations. In
EXPO's implementation the cost of collecting any observation is considered the same,
but if this were not the case COAST's first strategy would be helpful. EXPO does have
the other two strategies, since every experiment proves or disproves a hypothesis and
every experiment causes changes to the external world.

KEKADA [Kulkarni, 1988] (described in detail in Section 2.1.2) contains many heuristics
for guiding experimentation in scientific discovery. Although EXPO's experimentation
is geared to do more mundane learning, it is worth comparing both systems. Most
of the heuristics lead KEKADA to behavior that is similar to that of EXPO. Some of the
heuristics are hard coded in EXPO (PC0, PC1, PC4, PCS, HG3, HG8, HSC1, HSC2,
ES4, PG1, and DM8), others are expressed as strategies (PC3, PC7, EP6, HM4, HM5,
DM1, DM2, DM3, DM5, DM6, and DM7). EXPO could be expanded with some of
KEKADA's heuristics. PC2, PC6, and PCS implement a task-handling mechanism that
EXPO does not have. HG1, EP1, and ES3 provide KEKADA with class generalization,
which EXPO does not currently have. EP7, ES1, and ES2 have an exploratory flavor,
and as such are not appropriate for EXPO's task-driven learning. KEKADA has a
mechanism for switching from one hypothesis to another based on confidence factors (CF3,
CF4, CF5, and DM4). EXPO sticks to one hypothesis until it is proven or ruled out, but
an intelligent switch of attention would make EXPO more flexible.

Chapter 5


Methods  for  Learning  by 
Experimentation 

The  previous  chapter  showed  how  a  difference  between  the  system’s  expectations  and  the 
collected  observations  indicates  a  fault  in  the  domain  model.  Differences  are  opportunities 
for learning, so our system must be able to identify them, hypothesize which part of
the  domain  is  incomplete,  determine  the  particular  fact  that  is  missing  in  the  domain 
model,  and  correct  it  accordingly.  All  these  steps  are  different  depending  on  the  types 
of  failures,  and  the  previous  chapter  described  how  to  determine  that  a  failure  is  caused 
by  an  operator’s  incomplete  preconditions.  We  present  in  this  chapter  a  collection  of 
methods  for  learning  under  different  types  of  failures.  This  collection  is  not  exhaustive, 
but  it  is  indicative  of  how  experimentation  can  be  used  to  learn  new  knowledge  from  the 
environment. 

The chapter begins with a taxonomy of the types of facts that may be missing
when  the  domain  knowledge  is  incomplete,  which  is  used  as  a  guideline  for  the  presenta¬ 
tion  of  the  methods  in  the  rest  of  the  chapter. 


5.1  Refining  Incomplete  Domain  Knowledge 

Section  3.3.5  described  different  types  of  incompleteness  in  a  planner’s  domain  knowledge. 
Figure  5.1  summarizes  them  and  describes  every  type  in  more  detail.  All  these  facts  may 
be  acquired  by  experimentation.  The  methods  in  this  chapter  describe  how  it  can  be 
done  for  some  of  the  cases,  which  are  highlighted  in  the  figure. 

Figure 5.1: Domain knowledge that can be theoretically acquired by experimentation.
EXPO concentrates on operator refinement.

In the first case, an existing operator can be missing either a condition or an effect.
The condition may be a predicate or a negated predicate. Also, it can be a simple
predicate or one with a quantification over some of its variables. A missing effect can
be either in the add list or in the delete list. In either case, it may be unconditional or
context dependent (i.e., when it occurs only under certain state conditions). Section 5.2
describes a method that can be applied to acquire new conditions and effects of operators.

Entire operators may also be missing. If this is the case, several methods can be
applied that form an initial definition for the operator based on existing ones. This is
done by direct analogy, or by decomposition into a subsequence of operators, or by splitting
existing operators under different conditions. There is also the possibility of probing the
environment  by  trying  out  the  available  actions  under  new  conditions.  These  methods 
are  described  in  detail  in  Section  5.3. 

The  operators  can  be  incomplete,  but  the  state  may  also  be  missing  many  types  of 
knowledge. Certain types of objects may be unknown. New instances of object types
may be encountered by the system. Attributes of objects may be missing. New attributes
can be discovered, either in isolation or as combinations of other attributes. The range
of a known attribute may also be determined through interactions with the environment.
Finally, the value of an object's attribute may be found through experimentation. This
last case is addressed in Section 5.4.

The  methods  presented  in  this  chapter  and  their  implementation  in  EXPO  are  sum¬ 
marized  in  Figure  5.12. 


5.2  More  on  Operator  Refinement 

Section  4.1  described  a  method  for  learning  missing  preconditions  of  operators.  But  an 
operator  can  also  have  an  incomplete  set  of  effects.  Consider  again  the  GRIND  operator 
shown  in  Figure  4.1.  The  operator  is  missing  an  effect:  that  grinding  uses  up  cutting 
fluid,  so  the  machine  does  not  have  cutting  fluid  any  longer.  It  is  also  missing  information 
about  the  surface  finish  of  the  part  after  grinding.  As  it  turns  out,  depending  on  the 
coarseness  of  the  grit  of  the  wheel  the  finish  is  either  rough  or  smooth.  We  show  in  this 
section  how  these  facts  can  be  learned. 

The  method  for  acquiring  missing  preconditions  and  effects  will  be  referred  to  as  the 
Operator  Refinement  Method  (ORM). 

5.2.1  Learning  New  Postconditions 

Our  model  is  still  missing  the  fact  that  a  grinding  operation  uses  up  the  cutting  fluid. 
We  show  now  how  this  new  effect  can  be  learned. 

Suppose that our goal now is to grind part1 so that it is smaller in height and width.
This involves two successive applications of the operator GRIND, one for each dimension,
as shown in Figure 5.2. For the first grinding operation, our system would check that
all the preconditions of GRIND are true in the external world. Since this is the case, it
continues planning by applying the operator. Then it checks that the postconditions of
GRIND are true in the external state. Notice that because the system doesn't know that
the grinder uses up the fluid, the internal state reflects this by still containing
(has-fluid grinder1) after GRIND is applied. In the real world, the fluid has disappeared,
but the system is not yet aware of that fact.

Before  the  system  tries  to  grind  for  the  second  time,  it  checks  if  the  preconditions 
are  true  in  the  external  world.  It  is  at  this  point  that  it  finds  out  that  the  grinder  has 
no  fluid.  The  only  action  that  was  performed  since  the  fluid  was  last  checked  has  been 
grinding.  The  system  then  concludes  that  one  of  the  effects  of  grinding  is  consuming  the 
fluid in the machine, and so it modifies the delete list of the GRIND operator.

But in the general case, several operators could have been applied since the fluid in
the grinder was last checked. In that case, experiments are needed in order to determine
which of those operators is missing a postcondition that specifies the deletion of the
predicate has-fluid from the state. The method is summarized in Table 5.1.

Figure 5.2: Finding new postconditions of grinding. (Panels: External State and Internal
State, before and after GRIND(part1, HEIGHT) and GRIND(part1, WIDTH).)

This example shows how to learn from failure, but the same method can be used for
learning from unexpected successes. Notice that delete effects can also be learned with
this method, when the condition P is a negated predicate.

If a precondition P is true in the model, but it is not true in the external
world, then one of the operators applied after P was established in the model
has a previously unknown postcondition affecting P.

1. Select candidate operators. The candidate set consists of all operators
{O1, O2, ..., On} applied since P was last checked.

2. Identify incomplete operator. Formulate experiments over the candidate
set. In each experiment, after an operator is applied P is checked in the
external world. If as a result of an experiment with operator Oi, P is
unexpectedly changed in the world, then Oi is incompletely specified.

3. Add P as a new postcondition of operator Oi.


Table 5.1: Method for learning new postconditions

The heuristics in Section 4.3 were described for choosing hypotheses in the case of
missing preconditions, but they can also be used for learning new effects. Their use and
implementation is different. The first heuristic applied is locality. EXPO looks at the
bindings of the candidate operators and selects operators that affected the objects in
the new effect E. Notice that if the effect has any wildcard variables, E's objects do not
appear in the bindings of the candidate operators, and so this heuristic is not very
helpful. Next, the structural similarity heuristic is applied. EXPO looks in the operator
hierarchy for operators that have the effect, and looks in neighboring nodes for the
operators in the list of candidates, and ranks them according to their distance. When EXPO
finds the incomplete operator O, then it can use the generalization heuristic. It cannot be
used before because EXPO has focused attention and only observes known effects after the
execution of an operator. So E was never observed in previous executions of O. EXPO
starts monitoring E and generalizes according to the observations collected. Through the
generalization, the objects in E may be kept constant, generalized to operator variables,
or generalized to wildcard variables.

The method described in this section is limited to observing only the known conditions
and effects of each operator. It is possible to learn new effects more quickly if a larger
set of predicates is observed after the execution of an action. This would detect changes
in the state immediately after the execution. However, limited observation capabilities
are a more realistic setting in domains where large collections of data may be observed,
and it is the one chosen for EXPO.
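
The method of Table 5.1 can be illustrated with a small simulation. It is a sketch under stated assumptions, not EXPO's implementation: the external world is a set of facts with hidden delete effects, the model is a dictionary of delete lists, `make_world` stands for the pre-experiment plan that re-establishes a state in which P holds, and the operators HOLD and CLEAN are hypothetical candidates.

```python
# Hidden truth about the world; the learner cannot inspect this table directly.
TRUE_DELETES = {"HOLD": set(), "CLEAN": set(), "GRIND": {"has-fluid grinder1"}}

class World:
    def __init__(self, facts):
        self.facts = set(facts)

    def apply(self, op):
        # Executing an operator removes whatever it really deletes.
        self.facts -= TRUE_DELETES[op]

    def holds(self, fact):
        # Observation of a single predicate in the external world.
        return fact in self.facts

def learn_postcondition(model, candidates, p, make_world):
    """Experiment with each candidate operator until one unexpectedly changes p."""
    for op in candidates:
        world = make_world()       # reach a state in which p is true
        world.apply(op)            # run the experiment
        if not world.holds(p):     # p changed: op is incompletely specified
            model[op]["delete"].add(p)
            return op
    return None

model = {op: {"delete": set()} for op in TRUE_DELETES}
culprit = learn_postcondition(
    model, ["HOLD", "CLEAN", "GRIND"], "has-fluid grinder1",
    lambda: World({"has-fluid grinder1"}))
print(culprit, model["GRIND"]["delete"])
```

The candidates are tried here in the order given; the locality and structural similarity heuristics described above would determine that order in practice.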


5.2.2  Learning  Conditional  Effects 

Learning conditional effects is a mixture of learning new preconditions and new
postconditions. But it requires that the system keep additional information about the actions.

Suppose that the agent's goal is to grind two parts. Grinding part3 changes the
surface condition just as the system expects, as shown in Figure 5.3(a). Now it is
trying to grind part4. After executing the action, the effects of the operator are checked.
At this point, the system finds out that the postcondition that specifies that grinding
makes the surface condition of the part smooth does not always occur, as shown in
Figure 5.3(b). The system would then detect the presence of a conditional effect. Now
it will compare the state in which the effect happened and the state in which it did not
happen. The only difference in this case is the grit of the wheel, so it will add that as the
condition of the conditional effect. Another conditional effect can be learned to account
for the situation in which the surface finish produced by grinding is not smooth. Again,
if there are several differences between the states then experimentation would be needed
to determine the relevant one.

The  method  is  summarized  in  Table  5.2.  This  example  illustrates  that  the  system  will 
sometimes  encounter  situations  with  a  great  potential  for  learning.  In  this  case,  it  can 
also  learn  about  the  conditional  effect  in  case  of  using  a  wheel  with  coarse  grit,  which  is 
to  produce  a  rough  finish.  Because  the  conditional  expression  associated  with  an  effect 
is  a  concept  to  be  learned,  it  presents  similar  problems  to  precondition  learning  with 
respect  to  the  set  of  hypotheses. 


If an effect of an operator takes place in situation S_A but not in situation
S_B, then it is a conditional effect of the operator.

1. Select candidate conditions for the effect. The candidate set Δ(S_A, S_B)
is formed by calculating all the differences between S_A (the state in which
the effect occurs) and S_B (the state in which the effect does not occur).

2. Identify missing condition. Formulate experiments observing if the effect
of the operator occurs when one of the differences P is true in the
state. Use any information available to formulate the most promising
experiments first. In absence of knowledge, apply a binary search to
isolate the precondition from Δ(S_A, S_B).

3. Add P as a condition for the conditional effect of the operator O.


Table  5.2:  Method  for  learning  new  conditional  effects 
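
The binary search in step 2 of Table 5.2 can be sketched as follows. The sketch makes simplifying assumptions that are not part of the method as stated: exactly one difference is relevant, only facts present in S_A and absent from S_B are considered, and an experiment consists of adding a subset of those facts to S_B and observing whether the effect occurs. The predicate names and the `effect_occurs` oracle are hypothetical.

```python
def isolate_condition(s_a, s_b, effect_occurs):
    """Return the single fact from the state difference that enables the effect."""
    diffs = sorted(s_a - s_b)          # candidate conditions, in a fixed order
    while len(diffs) > 1:
        half = diffs[: len(diffs) // 2]
        # Experiment: S_B augmented with half of the remaining differences.
        if effect_occurs(s_b | set(half)):
            diffs = half               # the relevant condition is in this half
        else:
            diffs = diffs[len(diffs) // 2:]
    return diffs[0] if diffs else None

s_b = {"has-fluid grinder1"}
s_a = s_b | {"grit-of-wheel wheel1 FINE-GRIT",
             "is-clean part1",
             "material part1 STEEL"}

def effect_occurs(state):
    # Stand-in for executing GRIND and observing whether the finish is smooth.
    return "grit-of-wheel wheel1 FINE-GRIT" in state

condition = isolate_condition(s_a, s_b, effect_occurs)
print(condition)
```

With n candidate differences this needs about log2(n) experiments instead of n, which matters because each experiment requires planning and executing in the external world.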

Let us return for a moment to our example, at the point when the system encountered
the situation in Figure 5.3(a). When the situation in Figure 5.3(b) is later found and
the postcondition does not occur, the system must have a way to retract its knowledge,
restricting the effect with a condition that it learns by applying the method for
learning conditional effects. This example illustrates that the methods presented here
are not completely independent. A framework must be devised that allows the system to
combine them and apply whichever one seems most appropriate at each time, as we will
discuss in Chapter 7.

[Figure 5.3 shows, for two situations, the external state and the internal state
before and after applying GRIND to a part along HEIGHT. In both situations the state
before grinding contains the facts is-a (GRINDER and GRINDING-WHEEL), grit-of-wheel,
holding-tool, holding and has-fluid.

(a) The wheel has FINE-GRIT. After grinding, the external and internal states agree:
size-of HEIGHT 2 and surface-finish SMOOTH.

(b) The wheel has COARSE-GRIT. After GRIND(part4, HEIGHT), both states contain
size-of part4 HEIGHT 2, but the external state shows surface-finish ROUGH while the
internal state predicts surface-finish SMOOTH.]

Figure 5.3: Finding conditional effects of grinding


5.3  Learning  New  Operators 

There are many ways to learn new operators. Our methods are goal-directed: they are
triggered when the planner finds itself in a situation where it cannot solve a problem.
The system assumes that the available knowledge is incomplete and tries the various
methods to formulate new operators. Learning is always incremental, preferring
incomplete specifications (that are progressively refined by the Operator Refinement
Method, ORM) to more detailed specifications that may be incorrect. None of the methods
is guaranteed to work: only the external execution of a newly acquired operator can
show whether it has a meaning in the domain. In this section, we describe through
examples different ways of learning new operators, followed by a more formal
description of each method.


5.3.1  Direct  Analogy 

New  operators  can  be  learned  by  direct  analogy  with  existing  ones.  As  an  example, 
suppose  that  the  system  has  the  knowledge  about  drilling  holes  shown  in  Figure  5.4(a). 
A  hole  can  be  made  if  a  drill  has  a  high-helix  drill  bit  of  the  size  of  the  desired  hole  and 
some  cutting  fluid,  and  if  it  is  holding  a  part  that  has  a  spot  hole  in  the  appropriate 
location.  Suppose  now  that  the  system  is  given  the  goal  of  producing  a  part  with  a  hole 
in  it,  and  there  are  no  high-helix  drill  bits  available.  The  preconditions  of  the  operator  for 
drilling  cannot  be  achieved,  and  PRODIGY  is  not  able  to  solve  the  problem.  But  instead 
of  returning  a  failure,  our  system  uses  the  following  reasoning  to  derive  a  new  operator 
for  drilling  with  other  types  of  drill  bits  that  might  be  available.  The  system  finds  that 
both  high-helix  and  twist  drill  bits  are  of  the  same  object  type:  DRILL-BIT,  and  thus 
it  creates  the  new  operator  shown  in  plain  font  in  Figure  5.4(b).  The  new  operator  only 
gets  from  the  original  one  the  types  of  the  objects  that  it  is  applied  to,  and  the  effect 
that  it  is  created  for.  Experiments  are  performed  by  executing  the  action  under  different 
conditions  until  a  successful  application  is  found.  We  describe  in  the  next  paragraph 
how  the  experiments  can  be  designed  efficiently.  If  the  new  operator  cannot  be  applied 
successfully,  then  the  process  is  repeated  with  other  types  of  drill  bits.  If  this  does  not 
yield  any  success  either,  then  other  object  types  are  tried.  In  this  case,  a  new  operator 
for  drilling  holes  with  a  milling  machine  is  acquired  when  a  different  type  of  machine  is 
considered.  These  experiments  end  when  a  successful  application  of  a  newly  formulated 
operator  is  found  that  proves  its  existence.  Once  this  happens,  the  ORM  helps  to  locate 
additional  conditions  and  effects  that  are  specific  to  the  new  operator.  They  are  shown 
with  a  star  (*)  in  Figure  5.4(b).  The  method  is  summarized  in  Table  5.3.  Notice  that  the 
power  of  this  method  comes  from  the  possibility  of  relating  P  to  P’  through  the  object 
type  hierarchy. 

Choosing the right experiments is an important issue for making learning efficient.
The conditions for the experiments are guided by the preconditions and effects of the
original operator. If several operators for drilling are available, then experiments
that involve the preconditions and postconditions common to all drilling operations
are preferred. The more available operators that already contain information about
drilling, the more efficient the experiments designed to refine the new operator.
Notice that these are heuristics and they do not make any guarantees about the
convergence of the process.

(DRILL-WITH-HIGH-HELIX-DRILL
  (preconditions
    (and
      (is-a <machine> DRILL)
      (is-a <drill-bit> HIGH-HELIX-DRILL-BIT)
      (same <drill-bit-diameter> <hole-diameter>)
      (diameter-of-drill-bit <drill-bit> <drill-bit-diameter>)
      (has-fluid <machine> <fluid> <part>)
      (has-spot <part> <hole> <side> <loc-x> <loc-y>)
      (holding-tool <machine> <drill-bit>)
      (holding <machine> <holding-device> <part> <side>)))
  (effects (
      (del (is-clean <part>))
      (add (has-burrs <part>))
      (del (has-spot <part> <hole> <side> <loc-x> <loc-y>))
      (add (has-hole <part> <hole> <side> <hole-depth>
                     <hole-diameter> <loc-x> <loc-y>)))))

(a) An operator for drilling a hole using a high-helix drill bit

(DRILL-WITH-TWIST-DRILL
  (preconditions
    (and
      (is-a <machine> DRILL)
      (is-a <drill-bit> TWIST-DRILL-BIT)
    * (same <drill-bit-diameter> <hole-diameter>)
    * (diameter-of-drill-bit <drill-bit> <drill-bit-diameter>)
    * (has-spot <part> <hole> <side> <loc-x> <loc-y>)
    * (holding-tool <machine> <drill-bit>)
    * (holding <machine> <holding-device> <part> <side>)))
  (effects (
    * (del (is-clean <part>))
    * (add (has-burrs <part>))
    * (del (has-spot <part> <hole> <side> <loc-x> <loc-y>))
      (add (has-hole <part> <hole> <side> <hole-depth>
                     <hole-diameter> <loc-x> <loc-y>)))))

(b) New operator for drilling with a twist drill bit. The stars indicate new facts
acquired by the Operator Refinement Method for the new operator.

Figure 5.4: Learning a new operator for drilling by analogy with an existing one.

If a given problem cannot be solved by a set of operators because a precondition
P that specifies the type of an object of an operator O cannot be
achieved, formulate a new operator by direct analogy with O through P.

1. Find a related predicate. Look through the type hierarchy of the objects
in the domain and find P' such that it refers to objects of the same type
as the unachievable precondition P.

2. Formulate a new operator. Construct a new operator O' with the effects
of O that the original problem subgoaled on and all the object types of
O except P.

3. Experiment with the new operator. Execute the action. If the desired
effects are not obtained, apply experimentation to isolate which of the
other preconditions of O need to be added to O'. If O' is applied
successfully in some state, then continue with step 4. Otherwise, go back
to step 1, either looking for a different P' or considering a different P.

4. Refine the new operator. Apply the ORM to find all the preconditions
and additional effects of the new operator.

Table 5.3: Method for learning a new operator by direct analogy with an existing one.
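
Steps 1 and 2 of Table 5.3 can be sketched in Python as follows. This is an illustration under stated assumptions, not the thesis implementation: the type hierarchy is a plain child-to-parent dictionary, the has-hole literal is abbreviated, and the names Operator, sibling_types and analogous_operator are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Literal = Tuple[str, ...]
Effect = Tuple[str, Literal]  # ("add" | "del", literal)


@dataclass
class Operator:
    name: str
    preconds: List[Literal]
    effects: List[Effect]


def sibling_types(type_parent: Dict[str, str], t: str) -> List[str]:
    """Step 1: other types that share the parent of t in the type hierarchy."""
    parent = type_parent.get(t)
    if parent is None:
        return []
    return sorted(c for c, p in type_parent.items() if p == parent and c != t)


def analogous_operator(op: Operator, failed: Literal, new_type: str,
                       wanted: List[Effect], new_name: str) -> Operator:
    """Step 2: keep only the type preconditions of op (retargeting the failed
    one to new_type) and only the effects the problem subgoaled on."""
    preconds = [p for p in op.preconds if p[0] == "is-a" and p != failed]
    preconds.append(("is-a", failed[1], new_type))
    effects = [e for e in op.effects if e in wanted]
    return Operator(new_name, preconds, effects)


# Assumed fragment of a type hierarchy (child -> parent).
TYPE_PARENT = {
    "HIGH-HELIX-DRILL-BIT": "DRILL-BIT",
    "TWIST-DRILL-BIT": "DRILL-BIT",
    "DRILL": "MACHINE",
    "MILLING-MACHINE": "MACHINE",
}

HAS_HOLE: Effect = ("add", ("has-hole", "<part>", "<hole>"))  # abbreviated literal
DRILL_OP = Operator(
    "DRILL-WITH-HIGH-HELIX-DRILL",
    [("is-a", "<machine>", "DRILL"),
     ("is-a", "<drill-bit>", "HIGH-HELIX-DRILL-BIT"),
     ("holding-tool", "<machine>", "<drill-bit>")],
    [("del", ("is-clean", "<part>")), HAS_HOLE],
)

if __name__ == "__main__":
    failed = ("is-a", "<drill-bit>", "HIGH-HELIX-DRILL-BIT")
    for t in sibling_types(TYPE_PARENT, failed[2]):
        print(analogous_operator(DRILL_OP, failed, t, [HAS_HOLE], "DRILL-WITH-" + t))
```

The operator produced here is deliberately underspecified; steps 3 and 4 (experimentation and refinement by the ORM) are what recover the remaining preconditions and effects.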

5.3.2  Micro-operator  Formation 

New operators can also be acquired by learning useful partial specifications of an
existing one. One situation in which this is possible arises when only some of the
effects of the action are desired. In this case, experimentation is used to find
whether only some of the preconditions are required for the partial effects needed.

Suppose the system has the operator for cutting specified in Figure 5.5(a). The
operator expresses that if a circular saw has a type of attachment called a friction
saw and some cutting fluid, and if it is holding a part, then the size of the part can
be reduced and the resulting surface is smooth. Now suppose that the system is given a
problem whose goal is to make the size of a part smaller, and that no fluids are
available in the initial state. The goal cannot be achieved with the available
knowledge, and yet there is a way to solve the problem. The system formulates a new
cutting operator that has only the effects that it needs from the original one, and
only the preconditions that specify the types of the objects required for the
operator. The action is then executed. If the desired effect is not obtained, then the
system finds which additional conditions are required by experimenting with the
action, applying it in different situations. The experiments are guided by the
preconditions of the known operator for cutting. This process ends when a successful
application of the new operator is found (thereby proving its existence). This happens
when the desired effect is obtained in a state where not all the preconditions of the
original operator are true. Finally, the ORM is called to further refine the operator.
The result is a cutting operator without the preconditions and effects that have to do
with obtaining a reasonable surface quality (having fluid on the machine), as shown in
Figure 5.5(b). This method for learning a partial operator is summarized in Table 5.4.

(CUT-WITH-CIRCULAR-FRICTION-SAW
  (params (<machine> <part> <attachment> <holding-device> <dim> <value>))
  (preconds
    (and
      (is-a <part> PART)
      (is-a <machine> CIRCULAR-SAW)
      (is-a <attachment> FRICTION-SAW)
      (has-fluid <machine> <fluid> <part>)
      (size-of <part> <dim> <value-old>)
      (smaller <value> <value-old>)
      (side-up-for-machining <dim> <side>)
      (holding-tool <machine> <attachment>)
      (holding <machine> <holding-device> <part> <side>)))
  (effects (
      (del (has-fluid <machine> <fluid> <part>))
      (add (surface-finish-side <part> <side> SMOOTH))
      (add (size-of <part> <dim> <value>)))))

(a) An operator for cutting

(CUT-TO-SIZE
  (params (<machine> <part> <attachment> <holding-device> <dim> <value>))
  (preconds
    (and
      (is-a <part> PART)
      (is-a <machine> CIRCULAR-SAW)
      (is-a <attachment> FRICTION-SAW)
    * (size-of <part> <dim> <value-old>)
    * (smaller <value> <value-old>)
    * (side-up-for-machining <dim> <side>)
    * (holding-tool <machine> <attachment>)
    * (holding <machine> <holding-device> <part> <side>)))
  (effects (
      (add (size-of <part> <dim> <value>)))))

(b) New operator for cutting to reduce the size. The stars indicate new facts acquired
by the Operator Refinement Method for the new operator.

Figure 5.5: Micro-operator formation when only some effects are needed.

When a given problem cannot be solved by the current set of operators
because a precondition P of an operator O cannot be achieved, formulate a
new operator O'.

1. Formulate a new operator. Construct a new operator O' with the desired
effect and the type of the objects in O.

2. Experiment with the new operator. Execute the action. If the desired
effects are not obtained, apply experimentation to isolate which of the
other preconditions of O (not including P) need to be added to O'. End
the process when O' is successful in a state where the preconditions of
O are not true.

3. Refine the new operator. Use the ORM to find additional preconditions
and effects of O'.

Table 5.4: Method for learning a new operator by micro-operator formation

A second possibility is sequencing, i.e. detecting a sequence of subactions that are
currently represented by a single operator. As an example, consider the operator in
Figure 5.6(a), used to set up a machine for performing a machining operation. The
operator has several preconditions that check the availability of a machine, a holding
device, a tool and a part. The setup consists of holding the tool in the tool holder,
having a holding device on the machine, and holding the part with the holding device.
Since a different setup is used for each machining operation, representing this set of
actions as a single operator is an efficient way of expressing the configuration for
the next operation. Now, suppose that we want to perform some manual operation on a
part. We ask the system to find a plan to hold it. With the available knowledge,
holding a part is not possible because there are no tools that can be installed in the
machine. But instead of returning a failure, our system tries to find out whether the
operator can be divided into a sequence of actions, one of them involving only holding
the part. The operator for the setup yields several independent operators, shown in
Figure 5.6(b). Sequencing is done by following the same basic steps shown in Table
5.4, but in this case additional operators are formulated with the effects not
originally needed.
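
The first two steps of Table 5.4 can be approximated by the following Python sketch. It rests on simplifying assumptions that are not in the original method: candidate preconditions are added greedily in a fixed order rather than chosen by designed experiments, and the callback succeeds, which stands for executing the action in a state where exactly the given preconditions hold, is hypothetical.

```python
from typing import Callable, List, Optional, Tuple

Literal = Tuple[str, ...]


def form_micro_operator(type_preconds: List[Literal],
                        other_preconds: List[Literal],
                        succeeds: Callable[[List[Literal]], bool]
                        ) -> Optional[List[Literal]]:
    """Start from the type preconditions only and add the remaining
    preconditions of O (the unachievable P is left out by the caller)
    until the action produces the desired effect."""
    preconds = list(type_preconds)
    remaining = list(other_preconds)
    while not succeeds(preconds):
        if not remaining:
            return None  # no successful application found
        preconds.append(remaining.pop(0))
    return preconds


# Preconditions taken from the cutting operator; has-fluid is P and is excluded.
TYPES = [("is-a", "<part>", "PART"),
         ("is-a", "<machine>", "CIRCULAR-SAW"),
         ("is-a", "<attachment>", "FRICTION-SAW")]
OTHERS = [("holding-tool", "<machine>", "<attachment>"),
          ("holding", "<machine>", "<holding-device>", "<part>", "<side>"),
          ("side-up-for-machining", "<dim>", "<side>")]

# Simulated world (an assumption): cutting works once the tool and part are held.
NEEDED = set(OTHERS[:2])


def simulated_cut(preconds: List[Literal]) -> bool:
    return NEEDED <= set(preconds)


if __name__ == "__main__":
    print(form_micro_operator(TYPES, OTHERS, simulated_cut))
```

Because the search stops at the first success, the result may be missing preconditions or contain superfluous ones; in the method of Table 5.4 it is step 3, refinement by the ORM, that is responsible for correcting this.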

The two methods just presented for acquiring new operators, by sequencing or by
partially specifying a given one, are engaged in a process that we call micro-operator
formation.¹ Notice that the original operator is not discarded, since it can still be
useful to solve some problems efficiently.

¹ These methods can be thought of as opposite to the formation of macro-operators.
However, learning micro-operators is not necessarily the reverse process because it
does not imply the decomposition of an operator into a set of operators.

(SETUP
  (preconditions
    (and
      (is-a <machine> MACHINE)
      (is-of-type <tool> MACHINE-TOOL)
      (is-of-type <holding-device> HOLDING-DEVICE)
      (is-available-tool-holder <machine>)
      (is-available-tool <tool>)
      (is-available-table <machine>)
      (is-available-holding-device <holding-device>)
      (has-device <machine> <holding-device>)
      (is-empty-holding-device <holding-device> <machine>)
      (is-clean <part>)
      (~ (has-burrs <part>))))
  (effects (
      (add (holding-tool <machine> <tool>))
      (add (has-device <machine> <holding-device>))
      (add (holding <machine> <holding-device> <part> <side>)))))

(a) Operator to set up a machine for an operation

(SETUP-HOLDING-DEVICE
  (preconditions
    (and
      (is-a <machine> MACHINE)
      (is-of-type <holding-device> HOLDING-DEVICE)
      (is-available-table <machine>)
      (is-available-holding-device <holding-device>)))
  (effects (
      (add (has-device <machine> <holding-device>)))))

(SETUP-HOLD
  (preconditions
    (and
      (is-a <machine> MACHINE)
      (is-of-type <holding-device> HOLDING-DEVICE)
      (has-device <machine> <holding-device>)
      (is-empty-holding-device <holding-device> <machine>)
      (is-clean <part>)
      (~ (has-burrs <part>))))
  (effects (
      (add (holding <machine> <holding-device> <part> <side>)))))

(SETUP-TOOL
  (preconditions
    (and
      (is-a <machine> MACHINE)
      (is-of-type <tool> MACHINE-TOOL)
      (is-available-tool-holder <machine>)
      (is-available-tool <tool>)))
  (effects (
      (add (holding-tool <machine> <tool>)))))

(b) New operators for different aspects of a setup

Figure 5.6: Micro-operator formation by dividing an operator into sequential actions.

5.3.3 Learning New Operators by Splitting Existing Ones

Another way to learn new operators is to refine an existing operator by distinguishing
different aspects of the action that it represents.

Shen [Shen, 1989] describes a method for learning operators by splitting an existing
one. This method takes advantage of failures like the one described in Figure 5.2. For
the case in which the effects of the operator do not occur as expected, we described
how to refine the operator by adding the condition necessary to obtain the desired
effects. But we can also learn an additional operator with the effects that were
observed instead of the expected effects. For example, in the situation of Figure 5.2,
the system learns an operator for grinding without fluid, shown in Figure 5.7. The
description of this method is shown in Table 5.5. Since either method can be selected
under the same circumstances, the decision to be made is whether both actions are
interesting to the system.

(GRIND-WITH-FLUID
  (preconds
    (and
      (is-a <machine> GRINDER)
      (is-a <wheel> GRINDING-WHEEL)
    * (has-fluid <machine>)
      (holding-tool <machine> <wheel>)
      (side-up-for-machining <dim> <side>)
      (holding <machine> <holding-device> <part> <side>)))
  (effects (
    * (add (surface-finish <part> <side> SMOOTH))
      (add (size-of <part> <dim> <value>)))))

(GRIND-WITHOUT-FLUID
  (preconds
    (and
      (is-a <machine> GRINDER)
      (is-a <wheel> GRINDING-WHEEL)
      (holding-tool <machine> <wheel>)
      (side-up-for-machining <dim> <side>)
      (holding <machine> <holding-device> <part> <side>)))
  (effects (
      (add (size-of <part> <dim> <value>)))))

Figure 5.7: Splitting the operator GRIND when effects are different.

If after manipulating the world only a subset E of the effects of the operator
happen, then a precondition of the operator is missing.

1. Select candidate preconditions. The candidate set Δ(S_old, S_current) is
formed by calculating all the differences between S_old, the most similar
earlier state in the previous problem solving history in which O was applied
successfully, and the current state S_current (an unsuccessful application
of O).

2. Identify missing precondition. Formulate experiments observing if the
operator is successfully applied when one of the differences P is true in
the state. Use any information available to formulate the most promising
experiments first. In the absence of knowledge, apply a binary search to
isolate the precondition from Δ(S_old, S_current).

3. Substitute O by the two new operators O1 and O2. O1 is formed with
O and the additional precondition P. O2 is formed by the preconditions
of O and the set of effects E.


Table  5.5:  Method  for  splitting  an  operator  (Shen,  1989). 
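
Step 3 of Table 5.5 amounts to a simple construction once the missing precondition P and the observed effects E are known. The sketch below is only illustrative: the Operator record, the function split_operator and the way the new names are supplied are assumptions, and the grinding operator is abbreviated.

```python
from dataclasses import dataclass
from typing import Tuple

Literal = Tuple[str, ...]
Effect = Tuple[str, Literal]  # ("add" | "del", literal)


@dataclass(frozen=True)
class Operator:
    name: str
    preconds: Tuple[Literal, ...]
    effects: Tuple[Effect, ...]


def split_operator(op: Operator, missing: Literal, observed: Tuple[Effect, ...],
                   name1: str, name2: str) -> Tuple[Operator, Operator]:
    """O1 = O plus the missing precondition P; O2 = preconditions of O with
    only the subset E of effects that was actually observed."""
    o1 = Operator(name1, op.preconds + (missing,), op.effects)
    o2 = Operator(name2, op.preconds, observed)
    return o1, o2


SIZE: Effect = ("add", ("size-of", "<part>", "<dim>", "<value>"))
SMOOTH: Effect = ("add", ("surface-finish", "<part>", "<side>", "SMOOTH"))
GRIND = Operator(
    "GRIND",
    (("is-a", "<machine>", "GRINDER"),
     ("is-a", "<wheel>", "GRINDING-WHEEL"),
     ("holding-tool", "<machine>", "<wheel>")),
    (SMOOTH, SIZE),
)

if __name__ == "__main__":
    with_fluid, without_fluid = split_operator(
        GRIND, ("has-fluid", "<machine>"), (SIZE,),
        "GRIND-WITH-FLUID", "GRIND-WITHOUT-FLUID")
    print(with_fluid)
    print(without_fluid)
```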

5.3.4  Explicit  Expressions 

Another method for splitting operators follows the same steps described for learning
conditional effects. Given the situation described in Figure 5.3, we could obtain two
operators for grinding instead of learning new conditional effects. One would be built
from the original version with the additional condition that the grit of the wheel be
fine, and the additional effect that the surface finish is smooth. A second operator
would be built from the original one plus the precondition that the wheel is not of
fine grit. The result is shown in Figure 5.8. The method is summarized in Table 5.6.

Yet another possibility along this line is to split disjunctive concepts among
different operators. Suppose that, using the method for refining preconditions
presented in Figure 4.1, we learn the disjunctive precondition expression shown in
Figure 5.9(a). To grind a part, we need to hold it first, and to do so we need to have
some kind of holding device in the grinder. This operator represents the action of
putting a holding device in the grinder. The disjunction expresses that a grinder can
use two different holding devices: a magnetic chuck and a vise. But instead, we could
express the same concept as two different operators: one for putting a vise in a
grinder, and another one for putting a magnetic chuck. The two operators are expressed
in Figure 5.9(b).
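
Splitting a disjunctive precondition into one operator per disjunct is a purely syntactic transformation, which the following Python sketch illustrates. The tuple encoding of preconditions, the function split_disjunction and the numbered names it generates are assumptions made for the example.

```python
from typing import List, Tuple

# A precondition is either a literal such as ("is-a", "<x>", "VISE")
# or a disjunction ("or", literal_1, literal_2, ...).
Precond = tuple
OperatorSpec = Tuple[str, List[Precond], List[tuple]]


def split_disjunction(name: str, preconds: List[Precond],
                      effects: List[tuple]) -> List[OperatorSpec]:
    """Return one operator per disjunct of the first ("or", ...) precondition.
    An operator without a disjunction is returned unchanged."""
    for i, p in enumerate(preconds):
        if p[0] == "or":
            return [(f"{name}-{k + 1}",
                     preconds[:i] + [alt] + preconds[i + 1:],
                     list(effects))
                    for k, alt in enumerate(p[1:])]
    return [(name, list(preconds), list(effects))]


PRECONDS = [
    ("is-a", "<machine>", "GRINDER"),
    ("or",
     ("is-a", "<holding-device>", "MAGNETIC-CHUCK"),
     ("is-a", "<holding-device>", "VISE")),
    ("is-available-table", "<machine>"),
    ("is-available-holding-device", "<holding-device>"),
]
EFFECTS = [("add", ("has-device", "<machine>", "<holding-device>"))]

if __name__ == "__main__":
    for spec in split_disjunction("PUT-HOLDING-DEVICE-IN-GRINDER", PRECONDS, EFFECTS):
        print(spec)
```

The resulting operators share all remaining preconditions and effects, so together they cover the same situations as the disjunctive operator.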

(GRIND-WITH-COARSE-GRIT
  (preconds
    (and
      (is-a <machine> GRINDER)
      (is-a <wheel> GRINDING-WHEEL)
      (has-fluid <machine>)
      (holding-tool <machine> <wheel>)
      (side-up-for-machining <dim> <side>)
      (holding <machine> <holding-device> <part> <side>)
    * (grit-of-wheel <wheel> FINE-GRIT)))
  (effects (
    * (add (surface-finish <part> <side> SMOOTH))
      (del (has-fluid <machine>))
      (add (size-of <part> <dim> <value>)))))

(GRIND-WITH-NON-COARSE-GRIT
  (preconds
    (and
      (is-a <machine> GRINDER)
      (is-a <wheel> GRINDING-WHEEL)
      (has-fluid <machine>)
      (holding-tool <machine> <wheel>)
      (side-up-for-machining <dim> <side>)
      (holding <machine> <holding-device> <part> <side>)
    * (grit-of-wheel <wheel> COARSE-GRIT)))
  (effects (
    * (add (surface-finish <part> <side> ROUGH))
      (del (has-fluid <machine>))
      (add (size-of <part> <dim> <value>)))))

Figure 5.8: Splitting the operator GRIND according to its conditional effect

If an effect E of an operator takes place in situation A but not in situation
B, then it is a conditional effect of the operator.

1. Select candidate conditions for the effect. The candidate set Δ(S_A, S_B)
is formed by calculating all the differences between S_A (the state in which
the effect occurs) and S_B (the state in which the effect does not occur).

2. Identify missing condition. Formulate experiments observing if the effect
of the operator occurs when one of the differences P is true in the
state. Use any information available to formulate the most promising
experiments first. In the absence of knowledge, apply a binary search to
isolate the precondition from Δ(S_A, S_B).

3. Substitute O by the two new operators O1 and O2. O1 is formed with
O adding the additional precondition P and the effect E. O2 is formed
by the preconditions of O and the effects of O excluding E.

Table 5.6: Method for splitting operators according to conditional effects

Let us have a closer look at the last two methods for learning new operators by
splitting an existing one. Instead of learning a new conditional effect for an
operator, we split it into two different operators using the effect and its
conditions. Instead of learning a disjunctive precondition expression, we split the
preconditions into two different operators. In both cases, what the system is doing is
expressing some features of the action in the form of several operators, thereby
representing more explicitly what other methods already seen can learn. The new
operators represent information in a different but logically equivalent manner.
However, it is important to provide the system with this ability because it makes the
description of actions easier to understand. As we mentioned in Section 3.4, an action
can be represented by many operators, each operator reflecting a certain aspect of the
action. It is our experience that when the domain knowledge for a planner is written,
the user expresses actions not in a single complex operator, but in several simpler
and more detailed operators that are easier for humans to understand.

(PUT-HOLDING-DEVICE-IN-GRINDER
  (preconds
    (and
      (is-a <machine> GRINDER)
    * (or (is-a <holding-device> MAGNETIC-CHUCK)
    *     (is-a <holding-device> VISE))
      (is-available-table <machine>)
      (is-available-holding-device <holding-device>)))
  (effects ((add (has-device <machine> <holding-device>)))))

(a) Disjunction

(PUT-MAGNETIC-CHUCK-IN-GRINDER
  (preconds
    (and
      (is-a <machine> GRINDER)
    * (is-a <holding-device> MAGNETIC-CHUCK)
      (is-available-table <machine>)
      (is-available-holding-device <holding-device>)))
  (effects ((add (has-device <machine> <holding-device>)))))

(PUT-VISE-IN-GRINDER
  (preconds
    (and
      (is-a <machine> GRINDER)
    * (is-a <holding-device> VISE)
      (is-available-table <machine>)
      (is-available-holding-device <holding-device>)))
  (effects ((add (has-device <machine> <holding-device>)))))

(b) Explicit disjunction

Figure 5.9: Splitting an operator by a disjunction

5.3.5 Learning New Operators by Probing the Environment

Another way to create operators is to start with an empty description of the action,
try it out in the external world, and observe the changes that are produced. In this
case, the system learns a new action from null knowledge about it. This is very common
in systems that explore the environment, which often try actions to learn about their
capabilities [Shen, 1989]. We call this method probing; it is shown in Table 5.7. The
most important part of the method is deciding what to perceive in order to notice the
effects of the action and its conditions, without requiring that the system collect
all possible observations. A set of predicates P is chosen to direct the system's
attention. First the predicates in P are observed, then the action is executed, and
finally the predicates in P are observed again. Whatever changes are observed in any
P' in P are included as effects of the new operator. If no changes are observed, a new
set of predicates is tried and the process is iterated. If the action still does not
seem to change the environment, then the system tries to change the state by applying
other known actions and iterates the process again. For example, suppose that we are
exploring an action that pushes the drill spindle over the drill table. The drill
spindle rises again after the action of pushing is stopped. If there is no part on the
table, the environment remains unchanged. Executing the action in a new state in which
there is a part on the table will yield observations of changes in the external state.

5.4  Learning  New  Facts  about  the  State 

Even when a system has perfect knowledge about the operators of its task domain, it
might be impossible to solve some problems without the ability to interact with the
environment. The internal state might not contain all the data about the world needed
to plan. Some missing data can be acquired by direct observation, like the color of an
object within the visual field. Other observations require planning. For example, in
order to observe the color of an object in a distant room we first have to plan how to
get there. But perception and planning might not be enough to collect information
about a situation, and experimentation may be the only way to acquire some facts about
the state of the world.

When there is an available action with no corresponding operator, probe
the action and try to find a model of the action.

1. Choose what to observe. Choose a set of predicates P to observe. Collect
observations.

2. Execute the action. Then, observe all predicates in P again. Make the
effects of the operator be the subset of predicates P' in P that changed.
If no changes are observed, either go back to step 1 or change the world
by performing known actions and then go to step 1.

3. Refine the new operator. Apply the operator refinement method to find
additional preconditions and effects of the new operator.

Table 5.7: Method for probing available actions to learn new operators
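
The probing method of Table 5.7 (steps 1 and 2) can be sketched in Python as an observe, execute, observe loop. The sketch is a toy under explicit assumptions: the world is simulated by a set of facts, the drill example is reduced to two hypothetical predicates (has-spot, has-hole), and the names probe_action and make_drill_world do not come from the original system.

```python
from typing import Callable, Iterable, List, Set, Tuple

Fact = Tuple[str, ...]
Effect = Tuple[str, Fact]  # ("add" | "del", fact)


def probe_action(observe: Callable[[Set[str]], Set[Fact]],
                 execute: Callable[[], None],
                 predicate_sets: Iterable[Set[str]]) -> List[Effect]:
    """For each set of predicates P: observe, execute the action, observe
    again, and report the facts that changed as effects. An empty result
    means no change was noticed with any of the predicate sets."""
    for predicates in predicate_sets:
        before = observe(predicates)
        execute()
        after = observe(predicates)
        adds, dels = after - before, before - after
        if adds or dels:
            return ([("add", f) for f in sorted(adds)] +
                    [("del", f) for f in sorted(dels)])
    return []


def make_drill_world(part_on_table: bool):
    """Simulated environment: pushing the spindle turns a spot into a hole,
    but only if there is a part (with a spot) on the table."""
    facts: Set[Fact] = {("has-spot", "part1")} if part_on_table else set()

    def observe(predicates: Set[str]) -> Set[Fact]:
        return {f for f in facts if f[0] in predicates}

    def push_spindle() -> None:
        if ("has-spot", "part1") in facts:
            facts.discard(("has-spot", "part1"))
            facts.add(("has-hole", "part1"))

    return observe, push_spindle


if __name__ == "__main__":
    for with_part in (False, True):
        observe, push = make_drill_world(with_part)
        print(with_part, probe_action(observe, push, [{"has-spot", "has-hole"}]))
```

When the result is empty, the second alternative of step 2 applies: the state is changed with known actions (here, placing a part on the table) before probing again.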

Consider  for  example  the  observation  of  the  lock  status  of  a  door.  This  is  not  directly 
observable  by  looking  at  the  door.  Yet  we  can  design  an  experiment  to  collect  this  obser¬ 
vation  as  follows.  Since  the  predicate  (unlocked  <door>)  is  one  of  the  preconditions  of 
the  operator  OPEN,  we  can  design  an  experiment  to  try  to  open  the  door.  If  the  door  is 
unlocked,  then  all  the  conditions  of  OPEN  are  true  and  the  door  will  open.  If  the  door 
is locked, then OPEN will fail. The experiment uses a special version of OPEN that is
missing the unknown predicate in its preconditions.

Other  observations  need  a  more  complicated  experimentation  process.  For  example, 
consider a domain where an agent can carry objects of weight smaller than its own. A
simplified  description  of  the  knowledge  necessary  is  presented  in  Figure  5.10.  Suppose 
that  the  agent  does  not  know  its  own  weight.  Since  this  observation  is  absolutely  neces¬ 
sary  to  solve  any  problems  involving  carrying  objects,  the  system  engages  in  the  process 
of  acquiring  this  particular  piece  of  data  through  experimentation. 

(CARRY-OBJECT
  (preconditions
    (and
      (arm-empty <robot>)
      (next-to <robot> <obj>)
      (weight-of <obj> <obj-weight>)
      (weight-of <robot> <robot-weight>)
      (smaller-than <obj-weight> <robot-weight>)))
  (effects (
    (del (arm-empty <robot>))
    (del (next-to <robot> <obj>))
    (del (next-to <*other-obj> <obj>))
    (add (holding <obj>)))))

Figure 5.10: Operator for carrying objects of smaller weight than the agent.

To do so, it experiments with the action of carrying different objects and sees whether it
can carry them, as shown in Figure 5.11. The weight of the objects is a controllable
parameter that is chosen as part of the design of the experiments and depends on the
availability of the objects. A special version of the operator is used in the experiments,
constructed by dropping the preconditions that correspond to the unknown and its
relationship with the controllable variable. In our example they correspond to the weight
of the robot and the predicate smaller-than. When the action succeeds, the
preconditions of CARRY-OBJECT are true, including the relationship in question. Each
experiment collects new data about this relationship, further constraining the possible
values of the unknown variable. Determining the value of a parameter by binary
search over its possible values is a well-known experimentation method, and the process
eventually converges to the maximum weight that the agent can carry, which
is equal to its own. Notice that this is different from situations where we need to know
the value of an attribute that is deducible from observations whose acquisition requires
planning. Here, we are describing a more complicated process in which the system needs
to engage in experimentation strategies.
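
The narrowing of the interval in Figure 5.11 can be sketched as a binary search. The code below is an illustration under assumptions that are not in the thesis: weights are integers, an object of any requested weight is available, and can_carry is a hypothetical stand-in for executing the relaxed CARRY-OBJECT operator and observing whether it succeeded. The function name and the hidden limit of 62 are made up for the example.

```python
from typing import Callable, Tuple


def find_max_carriable(
    can_carry: Callable[[int], bool], low: int, high: int
) -> Tuple[int, int]:
    """Narrow the half-open interval [low, high) to a single weight.

    Assumes `low` is already known to be carriable and `high` is known
    not to be. Returns (largest carriable weight, experiments performed).
    """
    experiments = 0
    while high - low > 1:
        mid = (low + high) // 2      # weight chosen for the next experiment
        experiments += 1
        if can_carry(mid):
            low = mid                # success: the limit is at least mid
        else:
            high = mid               # failure: the limit is below mid
    return low, experiments


HIDDEN_LIMIT = 62                    # unknown to the learner in a real run


def simulated_carry(weight: int) -> bool:
    """Hypothetical oracle replacing execution in the environment."""
    return weight <= HIDDEN_LIMIT


limit, n_experiments = find_max_carriable(simulated_carry, 1, 100)
```

In practice the weights tried are restricted to the objects actually available, so the midpoint would be replaced by the closest available weight.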


5.5  Notes  on  Other  Types  of  Imperfect  Knowledge 

This  thesis  addresses  the  problem  of  acquiring  knowledge  in  incomplete  domains.  As  we 
mentioned  in  Section  3.3,  other  types  of  imperfections  require  additional  mechanisms. 
We  point  out  why  in  this  section. 

5.5.1  Refining  Incorrect  Knowledge 

Incorrect postconditions can be corrected in a very straightforward way. Since the system
always  observes  the  effects  of  an  operator  immediately  after  applying  an  action,  it  can 
detect  the  effects  that  are  incorrect  because  they  will  not  be  true  in  the  external  world. 
Effects that appear only sometimes should be considered conditional effects.

OBSERVATIONS COLLECTED:

(weight-of <obj> <weight>)    CARRY-OBJECT succeeded?    range of <robot-weight>

(weight-of object1 2)         y                          [1,?)
(weight-of object2 100)       n                          [1,100)
(weight-of object3 50)        y                          [50,100)
(weight-of object4 75)        n                          [50,75)
(weight-of object5 62)        y                          [62,75)
(weight-of object6 69)        n                          [62,69)
(weight-of object7 65)        n                          [62,65)
(weight-of object8 63)        n                          [62,63)

RESULT: (weight-of ROBOT 62)


Figure  5.11:  Gathering  data  from  the  state  by  directed  experimentation.  Repeated 
execution  of  the  operator  with  objects  of  different  weight  uncovers  the  weight  of  the 
robot. 

Detecting and removing incorrect preconditions from the operators requires mechanisms
in addition to the ones described above. The preconditions describe the class of
states where the action can be applied. If the preconditions are incorrect, they are
overspecific. This implies that the operator will only be applied to a subset of the class of
states where the action can be executed. The system would need additional mechanisms
that allow it to consider an incorrect operator applicable even if some of its preconditions
are not matched.

The  presence  of  incorrect  knowledge  might  be  detected  by  introspection  if  it  yields 
inconsistencies.  Experimentation  could  be  used  to  determine  the  source  of  the  inconsis¬ 
tencies  and  the  necessary  corrections. 


5.5.2  Learning  with  an  Inadequate  Domain  Model 

The  attributes  known  to  the  system  might  not  be  enough  to  describe  the  state  of  the 
external  world.  New  attributes  can  be  discovered  from  the  environment  when  the  system 
detects  that  the  given  attributes  are  not  sufficient  to  discriminate  between  situations  that 
produce  different  results.  Shen  [Shen,  1989]  presents  a  method  to  discover  new  attributes. 
Another  problem  arises  when  the  predicates  used  to  represent  the  attributes  are  missing 
certain  parameters  that  are  important.  We  will  suppose  in  our  work  that  the  system  is 
given the necessary attributes.

Attributes observable in the world can be combined to deduce new attributes that are
not directly observable. For example, the material of an object and its size determine its
weight. Combinations of attributes are functional constructs. Learning these constructs
requires providing the system with some basic constructs that it can combine to find the
right expressions for calculating the values of the derived attribute. For example, consider
our model for grinding. The operators are still incomplete because they do not contain
any information about the fact that they can only be applied when the dimensions of
the part become smaller (and not bigger). If a situation arises when grinding is applied
with that purpose, we could detect using experimentation strategies that there is a
relationship between the predicates

(size-of <part> <dim> <value>)

(size-of <part> <dim> <value-old>)

that  is  relevant  for  grinding  and  that  should  appear  in  the  preconditions. 

In fact, the correct precondition expression to be learned in this case would contain:

(size-of <part> <dim> <value>)

(size-of <part> <dim> <value-old>)

(smaller <value> <value-old>)

Learning  these  expressions  is  an  issue  that  discovery  systems  address  and  is  beyond 
the  scope  of  this  work. 


5.5.3  Learning  in  Intractable  Domain  Models 

Intractability arises when control knowledge is missing. Control knowledge avoids
planning inefficiencies. But in some cases, planning failures may be caused by unknown
interactions among operators because the system is missing the control knowledge that
represents those interactions.

A  method  for  learning  control  rules  by  experimentation  is  described  in  [Carbonell  and 
Gil,  1990].  The  method  consists  of  detecting  goal  interactions  when  the  system  observes 
that an action undoes a previously achieved subgoal. Much research has been done on
learning control rules by other methods [Minton et al., 1989a; Laird et al., 1986; Mostow
and Bhatnagar, 1987], but learning control knowledge from experience may prove to be a
very powerful approach.


5.6 Summary

Figure  5.12  presents  a  summary  of  all  the  methods  described  in  this  chapter.  Notice  that 
the  method  determines  the  type  of  knowledge  acquired.  Each  method  is  triggered  by  a 
certain  type  of  failure. 

All  the  methods  in  Figure  5.12  have  been  implemented  in  EXPO  to  demonstrate  the 
feasibility  of  learning  by  experimentation.  They  are  triggered  when  EXPO  detects  a  lack 
of  domain  knowledge,  but  the  subsequent  experimentation  process  is  simulated  manually. 
The  full  experimentation  process  (as  we  described  in  Chapter  4)  is  implemented  only  for 
learning  new  preconditions  and  new  effects.  Empirical  tests  on  this  implementation  are 
described  in  detail  in  the  next  chapter. 


WHAT IS LEARNED          WHEN IT IS LEARNED

new preconditions        when an action fails but it succeeded before,
                         some unknown precondition was true before and
                         is not true now

new effects              when an observation contradicts information
                         in the internal state, some action was executed
                         that had unknown effects

new conditional          when an expected effect only occurs sometimes
effects                  after an action is executed

new operators
  analogy                formulate operator by analogy with a known one
  splitting              when an action fails but it succeeded before,
                         learn one operator for each outcome
  conditional effects    when an effect only occurs sometimes, learn an
                         operator for each case (when effect occurs
                         and when it does not)
  disjunction            make disjunction explicit having several operators
  microoperators         when only some effects are wanted, build partial
                         operator

attribute values         when needed to plan: observe, infer, and plan
                         if needed. Design observations if several
                         are needed.

Figure 5.12: EXPO’s methods for refining incomplete domain knowledge. EXPO can
acquire new preconditions, effects, operators, and attribute values.


Chapter 6

Empirical  Results 


Given  any  learning  method,  it  is  important  to  demonstrate  its  effectiveness,  i.e.,  that  it 
can  indeed  be  used  to  acquire  new  knowledge.  In  many  cases,  the  efficiency  of  learning 
(the  time  spent  acquiring  new  knowledge)  is  also  a  main  concern.  This  chapter  presents 
empirical  measurements  that  demonstrate  the  effectiveness  and  efficiency  of  the  meth¬ 
ods  for  learning  by  experimentation  described  in  this  thesis.  The  first  section  contains 
results  that  show  the  effectiveness  of  EXPO  as  it  learns  to  refine  the  domain  operators. 
And  more  importantly,  we  show  that  the  new  versions  of  the  operators  are  useful  for 
the  problem  solver.  The  second  section  demonstrates  EXPO’s  efficiency.  Our  learning 
methods  are  very  directed,  and  the  experiments  actually  performed  are  geared  towards 
testing  the  most  promising  hypotheses.  This  translates  into  an  efficient  use  of  time  and 
other  resources  of  concern. 

EXPO implements the techniques for learning by experimentation presented in Chapters
4 and 5. The baseline planner is the PRODIGY system described in Section 3.6. EXPO
was  not  tested  interacting  with  a  physical  environment,  but  with  a  software  system  that 
simulates  one.  The  details  of  this  simulation  are  described  in  Section  3.4.2. 

The  results  presented  in  this  chapter  correspond  to  two  different  domains.  One  do¬ 
main  is  the  large  and  complex  model  of  process  planning  described  in  Appendix  B.  The 
other  one  is  a  simpler  robot  planning  domain,  presented  in  full  length  in  Appendix  A. 


6.1  Effectiveness 

The  results  presented  in  this  section  confirm  that  learning  by  experimentation  is  a  useful 
technique  to  acquire  new  domain  knowledge.  By  useful  we  mean  that  whatever  is  learned 
is needed in order to solve a task (i.e., a given set of problems). Notice how this differs
from other work on learning from the environment [Shen, 1989], where the focus is more
on exploring and on learning what is unknown about the external world, be it useful or
not.

We want to control the degree of incompleteness of a domain in the tests. We have
available a complete domain D which has all the operators with all their corresponding
conditions and effects. Only c conditions and e effects are learnable by EXPO. With this
complete domain, we can artificially produce domains D' that have a certain percentage of
incompleteness (e.g., 20% of the preconditions are missing) by removing preconditions or
effects from D randomly. We will use D_prec20 to denote a domain that is incomplete and
is missing 20% of the c learnable conditions. D_post20 is a domain missing 20% of the e
postconditions. Notice that EXPO never has access to D, only to the incomplete domain
D'.
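
As an illustration of how such an incomplete domain D' could be produced, the sketch below removes a random fraction of the learnable preconditions from a complete domain. It is a toy under stated assumptions: the dictionary representation, the function name make_incomplete, the operator and literal names, and the fixed random seed are all hypothetical and are not taken from EXPO or PRODIGY.

```python
import random
from typing import Dict, List, Tuple

Domain = Dict[str, List[str]]   # operator name -> precondition literals
Slot = Tuple[str, str]          # (operator name, precondition literal)


def make_incomplete(
    domain: Domain, learnable: List[Slot], fraction: float, seed: int = 0
) -> Tuple[Domain, List[Slot]]:
    """Return a copy of `domain` lacking `fraction` of the learnable slots."""
    rng = random.Random(seed)                 # seeded for repeatable tests
    k = round(fraction * len(learnable))      # how many conditions to drop
    removed = rng.sample(learnable, k)
    removed_set = set(removed)
    incomplete = {
        op: [p for p in preconds if (op, p) not in removed_set]
        for op, preconds in domain.items()
    }
    return incomplete, removed


# Hypothetical complete domain D with c = 5 learnable conditions.
D: Domain = {
    "OPEN": ["next-to-door", "unlocked", "closed"],
    "PICKUP": ["arm-empty", "next-to-obj"],
}
learnable_slots = [(op, p) for op, preconds in D.items() for p in preconds]

# D_prec20: 20% of the 5 learnable conditions (one condition) are missing.
D_prec20, removed_slots = make_incomplete(D, learnable_slots, 0.20, seed=1)
```

The list of removed slots is kept only for evaluation; the learner itself would see D_prec20 and nothing else.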

EXPO learns new conditions and effects of incomplete operators. What is a good
measure of the amount of new knowledge acquired by EXPO? As we described in Section
3.3, an incomplete domain may cause plan execution failures. Consider the case when
an operator O is missing a condition p. Now suppose that we want to execute O in
state S. If p happens to be true in S then the execution will be successful, since p is
a necessary condition of O. But if p is not true in S, then the execution of O will fail.
This means that missing preconditions can cause execution failures. Notice that after
EXPO learns that p is a condition of O, the problem may be solved (if subgoaling on the
unsatisfied new precondition p yields a subplan to achieve p and the rest of the plan
does not yield any execution failures). If knowledge is sufficiently complete then a plan
is always successfully executed. If knowledge is incomplete then a plan is not necessarily
successfully executed. Thus, an increment in the number of successful executions of plans
after learning is indicative of the amount of new preconditions acquired.

Now consider a case where an operator O is missing the postcondition (add (P)).
If we apply O in a state S where P is not true, P will continue not to be true in the
internal state after O is applied. Sometime later, we may need P to be true (e.g., if it is
a condition of a subsequent operator). The system believes P to be false, and after checking
the external world it finds out that P is true. Incorrect predictions of literals trigger
learning to acquire new effects (in this case (add (P)) for O). After learning, P is always
predicted to be true after applying O. Thus, a reduction in the number of incorrectly predicted
literals is indicative of the amount of new effects acquired.
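
The measure for new effects can be made concrete with a small sketch. The code below is illustrative only: states are modeled as plain sets of literal names, and the operator with effects P, Q and R is a made-up example, not one of the domains used in the tests. It compares the state predicted by the incomplete operator with the state observed in the (simulated) world, before and after the missing (add (P)) is learned.

```python
from typing import Dict, Set

Effects = Dict[str, Set[str]]  # hypothetical form: {"add": {...}, "del": {...}}


def apply_effects(state: Set[str], effects: Effects) -> Set[str]:
    """State after an operator: remove the deletes, then include the adds."""
    return (state - effects["del"]) | effects["add"]


def mispredicted(believed: Set[str], observed: Set[str]) -> Set[str]:
    """Literals on which the internal state and the observation disagree."""
    return believed ^ observed


true_O: Effects = {"add": {"P", "Q"}, "del": {"R"}}   # behavior of the world
known_O: Effects = {"add": {"Q"}, "del": {"R"}}       # missing (add (P))

start = {"R"}
observed = apply_effects(start, true_O)               # what the world shows
believed = apply_effects(start, known_O)              # what the model predicts
wrong_before = mispredicted(believed, observed)       # P is mispredicted

known_O["add"].add("P")                               # the effect is learned
wrong_after = mispredicted(apply_effects(start, known_O), observed)
```

Counting the literals in wrong_before and wrong_after over a fixed test set gives the kind of quantity whose reduction is used here as evidence of newly acquired effects.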

We  generated  n  problems  randomly.  All  of  the  n  problems  were  solvable  within  the 
time  bound  that  PRODIGY  was  given.  From  the  set  of  n  solvable  problems,  we  randomly 
chose  m  of  them  to  be  the  training  set.  The  rest  constituted  the  test  set.  Notice  that  both 
sets  are  independent  (they  do  not  have  any  common  instances).  Initially,  PRODIGY  is 
given  the  incomplete  domain  and  EXPO  starts  running  the  training  problems.  For  each 
problem, EXPO obtains a plan from PRODIGY and tries to execute it in the simulated
environment. EXPO examines any expectation failures and applies the methods for
learning  by  experimentation  described  in  this  thesis.  The  more  failures  encountered 
during  training,  the  more  opportunities  for  learning.  At  certain  points  during  learning, 
we  run  the  test  set.  Learning  is  turned  off  at  test  time,  so  when  a  failure  is  found  the 
internal  state  is  corrected  to  reflect  the  observations  but  no  learning  occurs. 
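
The training and test protocol can be sketched as a loop. The code below is a deliberately simplified toy and rests on several assumptions that are not part of EXPO: operators have preconditions only, the planner is assumed to achieve any unmet condition it knows about, and experimentation is replaced by directly adding the hidden condition to the model. All names and the small problem sets are hypothetical.

```python
from typing import Dict, List, Optional, Set, Tuple

Preconds = Dict[str, Set[str]]        # operator -> precondition literals
Problem = Tuple[Set[str], List[str]]  # (initial state, plan as operator names)


def hidden_failure(
    problem: Problem, true_pre: Preconds, known_pre: Preconds
) -> Optional[Tuple[str, Set[str]]]:
    """Return (operator, unknown unmet conditions) for the first failing step."""
    state, plan = problem
    for op in plan:
        # Known unmet conditions are assumed to be handled by subgoaling.
        hidden = (true_pre[op] - state) - known_pre[op]
        if hidden:
            return op, hidden
    return None


def run_protocol(
    train: List[Problem], test: List[Problem],
    true_pre: Preconds, known_pre: Preconds, checkpoint_every: int,
) -> List[Tuple[int, int]]:
    """Train with learning on; at checkpoints, run the test set with learning off."""
    known = {op: set(conds) for op, conds in known_pre.items()}
    curve = []
    for i, problem in enumerate(train, start=1):
        failure = hidden_failure(problem, true_pre, known)
        while failure is not None:
            op, hidden = failure
            known[op] |= hidden       # stand-in for learning by experimentation
            failure = hidden_failure(problem, true_pre, known)
        if i % checkpoint_every == 0:
            solved = sum(hidden_failure(p, true_pre, known) is None for p in test)
            curve.append((i, solved))
    return curve


true_pre = {"OPEN": {"unlocked", "next-to-door"},
            "PICKUP": {"arm-empty", "next-to-obj"}}
known_pre = {"OPEN": {"next-to-door"}, "PICKUP": {"next-to-obj"}}
train = [({"unlocked"}, ["OPEN"]), (set(), ["OPEN"]),
         ({"arm-empty"}, ["PICKUP"]), (set(), ["PICKUP"])]
test = [(set(), ["OPEN"]), (set(), ["PICKUP"]), (set(), ["OPEN", "PICKUP"])]

curve = run_protocol(train, test, true_pre, known_pre, checkpoint_every=2)
```

Each entry of curve pairs the number of training problems seen so far with the number of test plans that execute without failure, which is the shape of data plotted in part (b) of the effectiveness figures.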

In the robot planning domain, there were 60 training problems and 12 test problems,
taken from previous work in PRODIGY [Minton, 1988]. We ran tests with 20% and
50% missing preconditions. D_prec20 is missing 12 preconditions, and D_prec50 is missing
28. Figures 6.1(a) and 6.2(a) show the number of failures that EXPO detects during
training with D_prec20 and D_prec50 respectively. Figures 6.1(b) and 6.2(b) show how many
solutions for problems in the test set were successfully executed with D_prec20 and D_prec50
respectively. The number of plans that PRODIGY is able to execute correctly increases
with learning. This is because the problems in the training set cause expectation failures,
which EXPO uses to gain new knowledge after undergoing experimentation.

For D_prec20, EXPO has not examined enough failures to acquire all domain knowledge,
but it has acquired the knowledge necessary to execute successfully the solutions to all
the problems in the test set. For D_prec50, only 4 solutions to the test problems are
executed successfully. This is because the training set does not contain problems that
cause failures that yield the knowledge necessary to overcome the execution failures in
the test set. After training with the test set, one more new condition is learned, which
turns out to be the common cause of the execution failures in the test set, and thus the
solutions to all the test problems can be successfully executed.

In the process planning domain, there were two sets of training and test problems.
Each training set had 100 problems, and each test set had 20 problems. The problems
were generated randomly, as we explain in Appendix B. The tests were run in domains
with 10% and 30% incompleteness. Figures 6.3 and 6.4 present results for D_prec10 and
D_prec30 respectively when EXPO acquires new preconditions. The curves show results
very similar to the results obtained for the robot planning domain.

As an example of what is learned, EXPO refines the operator GRIND shown in Figure
4.1, adding the facts shown with a star (*) in Figure 6.5.

We also ran tests with domains where 20% and 50% of the postconditions of operators
were missing. Figures 6.6 and 6.7 show the results for D_post20 and D_post50 respectively
in the robot planning domain. As more failures are encountered, EXPO acquires new
effects of operators. Thus, the number of incorrect predictions when running the test set
is reduced continuously.

[Figure 6.1: plots (a) and (b); horizontal axis: number of training problems]

Figure 6.1: Effectiveness in the robot planning domain with 20% of the preconditions
missing (D_prec20). (a) Cumulative number of failures in the execution of solutions to
training problems encountered by EXPO as the size of the training set increases. Each failure
presents an opportunity for learning. (b) The number of plans successfully executed in
the test set increases as EXPO examines more failures. The number of additional plans
successfully executed is indicative of the amount of knowledge acquired by EXPO.

[Figure 6.2: plots (a) and (b); horizontal axis: number of training problems]

Figure 6.2: Effectiveness in the robot planning domain with 50% of the preconditions
missing (D_prec50). (a) Cumulative number of failures in the execution of solutions to
training problems encountered by EXPO as the size of the training set increases. Each failure
presents an opportunity for learning. (b) The number of plans successfully executed in
the test set increases as EXPO examines more failures. The number of additional plans
successfully executed is indicative of the amount of knowledge acquired by EXPO.

[Figure 6.3: plots (a) and (b); horizontal axis: number of training problems]

Figure 6.3: Effectiveness in the process planning domain with 10% of the preconditions
missing (D_prec10). (a) Cumulative number of failures in the execution of solutions to
training problems encountered by EXPO as the size of the training set increases. Each failure
presents an opportunity for learning. (b) The number of plans successfully executed in
the test set increases as EXPO examines more failures. The number of additional plans
successfully executed is indicative of the amount of knowledge acquired by EXPO.

[Figure 6.4: plots (a) and (b)]

Figure 6.4: Effectiveness in the process planning domain with 30% of the preconditions
missing (D_prec30). (a) Cumulative number of failures in the execution of solutions to
training problems encountered by EXPO as the size of the training set increases. Each failure
presents an opportunity for learning. (b) The number of plans successfully executed in
the test set increases as EXPO examines more failures. The number of additional plans
successfully executed is indicative of the amount of knowledge acquired by EXPO.

(GRIND
  (preconditions
    (and
      (is-a <machine> GRINDER)
      (is-a <tool> GRINDING-WHEEL)
      (is-a <part> PART)
*     (is-clean <part>)
*     (~ (has-burrs <part>))
*     (has-fluid <machine>)
      (~ (same <dim> DIAMETER))
      (holding-tool <machine> <tool>)
      (side-up-for-machining <dim> <side>)
      (holding <machine> <holding-device> <part> <side>)))
  (effects (
*     (del (is-clean <part>))
*     (add (has-burrs <part>))
*     (del (has-fluid <machine>))
*     (del (surface-finish <part> <side> <s-q>))
      (del (size-of <part> <dim> <value-old>))
      (add (size-of <part> <dim> <value>)))))

Figure  6.5:  A  More  Complete  Model  of  Grinding 

6.2  Efficiency 

The  previous  section  showed  that  EXPO  is  indeed  able  to  acquire  new  knowledge  through 
experimentation.  So  the  techniques  presented  in  this  thesis  are  effective  in  that  they  do 
lead  EXPO  to  the  cause  and  repair  of  the  failures  it  encounters.  But  this  is  not  the 
only  desirable  property  of  this  type  of  learning.  In  fact,  as  we  discussed  in  Chapter  4, 
minimising  the  number  of  experiments  is  another  important  concern.  This  section  takes 
a  close  look  at  the  efficiency  of  the  experimentation  process. 

Figures  6.8  and  6.9  present  the  number  of  experiments  that  are  required  to  recover 
from  the  failures  shown  in  Figures  6.1(a)  and  6.2(a)  respectively.  The  heuristics  used  are 
represented by a letter: g for generalization, s for structural similarity, and l for locality.
Without  any  of  our  hypothesis-selection  heuristics,  many  experiments  are  needed.  The 
other  curves  show  how  effective  each  heuristic  is  individually  and  in  combination  with 
others.  Each  heuristic  contributes  in  its  own  way  to  reducing  the  number  of  experiments. 
Notice  that  although  the  divide  and  conquer  experimentation  does  a  smaller  number  of 
experiments  than  some  of  the  heuristics  used  in  isolation,  every  experiment  requires  a 
larger  number  of  goal  statements  to  satisfy,  as  explained  in  Section  4.2.  For  20%  incom¬ 
pleteness,  the  three  heuristics  combined  yield  the  best  results.  For  50%  incompleteness, 
gl  is  about  as  good  as  gls.  This  is  because  when  the  operators  are  very  incomplete  similar 




[Figure 6.6: plots (a) and (b); horizontal axis: number of training problems]

Figure 6.6: Acquisition of new effects in the robot planning domain with 20% of the
effects missing (D_post20). (a) Cumulative number of failures in the execution of training
problems encountered by EXPO as the size of the training set increases. Each failure
presents an opportunity for learning. (b) The number of incorrectly predicted literals in
the test set decreases as EXPO examines more failures. This is indicative of the amount
of new effects of operators acquired by EXPO.

[Figure 6.7: plots (a) and (b)]

Figure 6.7: Acquisition of new effects in the robot planning domain with 50% of the
effects missing (D_post50). (a) Cumulative number of failures in the execution of training
problems encountered by EXPO as the size of the training set increases. Each failure
presents an opportunity for learning. (b) The number of incorrectly predicted literals in
the test set decreases as EXPO examines more failures. This is indicative of the amount
of new effects of operators acquired by EXPO.

Figure 6.8: Given D_prec20, number of experiments that are necessary with all the
combinations of the three hypothesis-selection heuristics: generalization of experience (g),
locality (l), and structural similarity (s). The number of experiments needed is greatly
reduced when the three of them are used.

operators may be missing the same conditions, so s is not very helpful. The effectiveness
of s improves as new knowledge is added to the domain; this can be seen in the numbers
of the last rows of the tables presented next.

Figure 6.9: Given D_prec50, number of experiments that are necessary with all the
combinations of the three hypothesis-selection heuristics: generalization of experience (g),
locality (l), and structural similarity (s). The number of experiments needed is greatly
reduced when the three of them are used.

The following tables show the numerical results that are summarized in Figures 6.8
and 6.9. The number of experiments needed with each combination of heuristics is
shown for each failure. Also shown is the number of experiments needed if no heuristics
are used, which corresponds to the ranking by default of the missing condition in the
list of hypotheses. The last column shows the total number of hypotheses in the set of
candidates.

With D_prec20:

number of experiments                              total number
gls   gs   ls   g   l   s   no heuristics          of hypotheses

  2    1   17    2    9    5   18   80   81
  2    3   18    2   53    2   18   43   82

[The remaining rows of this table and the heading of the second table (with D_prec50)
are garbled in the source. Each intact row carries nine values, so one column heading
(presumably gl) did not survive. Surviving values, in order: 18 6 50 16 26 1 41 2 2 1
42 14 14 39 52 21 18 1 17 9 17 16 1 31 5 16 2 16 19 5 7 25 66]

  1    1    1   14    1   10   61   41   74
  1    1    1   41    1   40   40   38   81
  1    1    2    4    1   12   15   50   87
  2    7    2    6    7   19    8   70   86
  5    4    5    4    5    6   15   28   73

Let  us  examine  this  last  table  more  closely,  and  look  at  the  effects  of  each  heuristic  in 
the  ranking  of  candidate  conditions.  As  we  pointed  out  before,  the  heuristic  of  structural 
similarity  is  increasingly  more  effective,  since  the  operators  in  the  hierarchy  become 
more  complete  through  learning.  The  predicate  (inroom  <key>  <room>)  is  added  as 
a  new  precondition  of  LOCK  in  the  7th  failure  (row  7  in  the  table),  and  also  as  a 
new  precondition  of  UNLOCK  in  the  16th  failure  (row  16).  In  the  7th  failure,  the 
similarity  heuristic  does  not  find  similar  operators  with  this  condition,  so  it  ranks  it 
low.  In  the  16th  failure,  LOCK  is  found  very  close  to  UNLOCK  in  the  hierarchy  and 
it  has  the  precondition  (inroom  <key>  <room>),  so  this  candidate  is  ranked  high.  The 
generalization  of  past  experience  also  becomes  more  effective  when  more  executions  of 
the  operators  are  examined.  Row  14  corresponds  to  the  new  precondition  (arm-empty) 
of  the  operator  PICKUP-OBJ.  Notice  that  the  new  precondition  does  not  have  any  of 
the  parameters  of  the  operator,  and  as  a  result,  the  locality  heuristic  ranks  this  candidate 
very  low. 
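
The relation between ranking and number of experiments can be illustrated with a small sketch. It assumes, for illustration only, that each heuristic is a yes/no test on a candidate condition and that candidates are ordered by how many heuristics favor them, with ties kept in the default order; the actual scoring used by EXPO is not specified in this section. The candidate list, the operator parameters, and the function names are hypothetical.

```python
from typing import Callable, List, Sequence

Heuristic = Callable[[str], bool]


def rank_candidates(
    candidates: Sequence[str], heuristics: Sequence[Heuristic]
) -> List[str]:
    """Order candidates by number of favoring heuristics (stable on ties)."""
    return sorted(candidates, key=lambda c: -sum(h(c) for h in heuristics))


def experiments_needed(ranked: Sequence[str], missing_condition: str) -> int:
    """Hypotheses are tested in turn, so the cost is the rank of the true one."""
    return ranked.index(missing_condition) + 1


# Default ordering of hypothetical candidates for a failure of UNLOCK.
candidates = [
    "(inroom <box> <room>)",
    "(arm-empty)",
    "(inroom <key> <room>)",
    "(next-to <robot> <door>)",
]


def locality(candidate: str) -> bool:
    """Favor conditions that mention a parameter of the failed operator."""
    return "<key>" in candidate or "<door>" in candidate


def similarity(candidate: str) -> bool:
    """Favor conditions already present in a structurally similar operator."""
    lock_preconditions = {"(inroom <key> <room>)"}   # learned earlier for LOCK
    return candidate in lock_preconditions


target = "(inroom <key> <room>)"
without = experiments_needed(rank_candidates(candidates, []), target)
with_both = experiments_needed(
    rank_candidates(candidates, [locality, similarity]), target)
arm_empty_rank = experiments_needed(
    rank_candidates(candidates, [locality]), "(arm-empty)")
```

In this toy, a condition such as (arm-empty), which mentions none of the operator's parameters, drops to the bottom of the list under locality alone, mirroring the behavior described for row 14.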

In summary, the combination of the three heuristics (generalization of experience,
structural similarity, and locality) dramatically reduces the number of experiments
required, and yields the best performance. A divide and conquer strategy over the set of
candidates requires many more experiments that also have more complex setups.


Chapter  7 


Conclusions and Future Work


This  chapter  summarizes  the  contributions  and  limitations  of  this  thesis,  and  outlines 
some  areas  of  future  work. 


7.1  Summary  of  the  Approach  and  Results 

The  thesis  presents  a  general  framework  and  an  effective  and  efficient  approach  to  the 
practical  implementation  of  learning  by  experimentation.  The  methods  presented  are  do¬ 
main  independent,  and  do  not  require  any  knowledge  other  than  the  domain  defined  by 
the  user  for  planning.  The  thesis  shows  that  it  is  possible  to  recover  from  knowledge-level 
impasses  autonomously  without  need  of  causal  explanations  of  the  failure.  Automated 
learning  by  experimentation  is  a  desirable  capability  of  autonomous  systems  and  it  re¬ 
lieves  humans  of  much  work  in  the  engineering  of  knowledge,  taking  over  the  burden 
of  ensuring  knowledge  completeness  and  maintenance  once  an  initial  knowledge  base  is 
constructed.  This  thesis  presents  a  step  in  that  direction. 

The  work  in  this  thesis  is  applicable  to  a  wide  range  of  planning  problems  in  which 
the  following  items  are  feasible: 

•  discrete-valued  features  describe  the  state  of  the  world. 

•  actions  are  axiomatizable  as  deterministic  operators  in  terms  of  the  features  that 
describe  the  state. 

•  reliable  observations  are  available  on  demand. 

•  sensors are noise-free.

•  no other agents are present whose actions interfere with the planner’s.

Future  work  includes  extensions  in  all  these  areas,  and  is  discussed  in  Section  7.3. 


7.2  Contributions 

The theoretical contributions described in this thesis are:

•  A  closed-loop  integration  of  planning  and  learning  from  the  environment  by  exper¬ 
imentation  where  new  knowledge  is  immediately  incorporated,  tested,  and  used  by 
the  planner 

•  Systematic  augmentation  of  a  given  incomplete  domain  by  directed  experimenta¬ 
tion,  triggered  each  time  that  there  is  a  knowledge  impasse 

•  Acquisition  of  domain  knowledge  of  a  planner  so  it  is  able  to  solve  problems  it  could 
not  solve  before  learning 

•  Computationally effective methodology for correcting incomplete domain knowledge

•  Exploration  of  methods  for  learning  by  experimentation,  including  hypothesis  gen¬ 
eration,  filtering,  prioritization,  and  empirical  validation. 

•  Domain-independent  heuristics  for  finding  relevant  hypotheses 

•  Efficient  and  customizable  experimentation  control  strategies  maximizing  conver¬ 
gence  on  identification  of  missing  knowledge 

•  A  framework  for  the  interaction  between  the  main  planning  space  and  the  experi¬ 
mentation  planning  space 

EXPO’s implementation of the above presents the following practical contributions:

•  An  implementation  that  demonstrates  the  synergistic  interactions  between  a  plan¬ 
ning  system  and  an  active  learner  that  acquires  domain  knowledge  from  the  envi¬ 
ronment 

•  An  empirical  evaluation  of  methods  with  various  degrees  of  initial  incompleteness 
in  the  domain,  and  with  different  sets  of  experimentation  heuristics  to  identify  the 
sources  of  power  and  extensibility  of  the  approach. 

•  Multi-domain generality (robot planning and complex process planning)


7.3  EXPO’s Limitations and Future Work

This section describes the limitations of this thesis and some suggestions for future work.
The section is organized under three major areas: the specific methods for learning by
experimentation, the interaction with the environment, and the global framework for
experimentation.

7.3.1  Extensions  to  the  Learning  Methods 

EXPO’s current implementation for learning new preconditions, described in Chapter 4,
is  limited  to  acquiring  a  new  conjunct  which  is  an  observable  predicate.  Every  member 
of  the  set  of  candidate  new  preconditions  is  an  observable  predicate.  EXPO  considers 
as  hypotheses  only  the  members  of  that  set,  and  tests  them  through  experiments.  If  the 
experiments  show  that  none  of  the  predicates  in  the  set  is  a  new  precondition,  EXPO 
gives  up  on  acquiring  the  precondition  autonomously:  it  notifies  the  user  that  it  knows 
that  the  operator  is  missing  a  precondition  and  that  it  cannot  find  it.  EXPO  considers 
only  the  inclusion  of  additional  conjunctive  predicates  (the  most  common  and  useful 
scenario).  Other  possible  hypotheses  to  be  considered  as  candidate  conditions  are: 

•  Disjunctive  expressions  of  predicates 

•  Inferred  predicates  deduced  from  a  state  by  theorem  proving  (or  other  inferential 
processes) 

•  Quantified expressions of some predicates

•  Predicates that are never observed because they were not needed for planning before
(e.g., the weight of a box)

•  A  functional  relation  of  several  predicate  arguments 

EXPO  examines  the  hypotheses  produced  by  the  method.  If  the  experiments  show 
that  the  missing  condition  is  not  one  of  them,  then  it  should  consider  the  above  possibil¬ 
ities.  However,  to  simplify  the  implementation,  EXPO  abandons  learning  and  continues 
plan  execution. 

Using a more sophisticated concept learning algorithm for generalization would expand
EXPO’s capabilities to acquire expressions other than conjunctive ones, including
disjunctions and quantified expressions. Functional relations between predicate arguments
require an algorithm with the capability to construct new functions, such as BACON
[Langley et al., 1987].

Learning preconditions that are inferred or unobserved predicates is an open research
question. EXPO could expand its set of hypotheses to inferred and unobserved predicates,
and deduce or observe their value during the experiments. This solution would be very
inefficient because a large number of predicates may belong to this group.

EXPO  assumes  an  initially  incomplete  knowledge  base,  but  many  other  types  of 
imperfections  are  possible,  as  described  in  Section  3.3.  The  domain  knowledge  can  be 
incorrect,  inadequate,  or  intractable.  Section  5.5  outlined  some  possibilities  to  address 
these  different  types  of  imperfections. 

We described in Chapter 5 how experimentation is needed to collect observations
from the state. When we cannot observe directly whether a door is locked or unlocked, we
can experiment by trying to open it and immediately know the answer. Robotic systems may
benefit enormously from this capability of experimentation.

Expanding  the  system’s  vocabulary  by  learning  new  features  about  objects  in  the 
state  is  an  open  area.  [Shen,  1989]  addressed  this  problem  in  the  LIVE  system,  which 
could  detect  hidden  features  and  learn  their  value.  Research  on  constructive  learning 
is  expanding  horizons  in  this  direction,  and  the  area  of  autonomous  learning  from  the 
environment  should  benefit  from  that. 

In short, whereas this dissertation makes a substantial contribution to learning by
experimentation, there is a vast open space of additional research topics in proactive
experimentation.

7.3.2  Interaction  with  the  Environment 

The  work  in  this  thesis  has  a  limited  form  of  interaction  with  the  environment.  The 
assumption  of  noise-free  sensors  allows  the  algorithms  to  count  on  reliable  feedback,  but 
it  is  not  a  very  realistic  assumption  for  some  domains.  Work  on  inductive  learning  from 
noisy data could be applied if sensors were unreliable. Experience in robotics research
leads us to believe that this is not a simple problem.

The presence of other agents that can change the environment may inadvertently
cause the internal state to diverge from the external world. These differences would cause
failures that are not due to a fault in the knowledge base. A solution to the problem
of determining the cause of divergence could be a more sophisticated credit-assignment
system for failures. Nondeterminism in the actions would cause a similar problem.

Learning  by  experimentation  autonomously  from  the  environment  is  not  as  direct  for 
many  applications  outside  planning.  Other  intelligent  systems  are  focused  on  tasks  where 
the  interaction  with  the  environment  is  expensive,  impractical,  or  simply  impossible  to 
obtain. Medical diagnosis systems are a good example. However, it is conceivable to
use EXPO’s strategies in such systems to produce experiments that would translate into
questions for an expert, or a request for additional data gathering.


7.3.3  Toward  a  Framework  for  Learning  by  Experimentation 

Figure  7.1  summarizes  the  framework  for  learning  from  the  environment  by  experimen¬ 
tation  presented  in  this  thesis.  Given  a  goal,  a  plan  to  achieve  it  is  executed  while  the 
external  environment  is  monitored.  Any  differences  with  the  internal  state  are  detected 
by  various  methods  that  suggest  a  type  of  fault  in  the  domain  knowledge  that  may  have 
caused  the  expectation  failure.  The  methods  also  construct  a  set  of  concrete  hypothe¬ 
ses  to  repair  the  fault.  After  being  heuristically  filtered,  one  hypothesis  is  tested  at  a 
time  with  an  experiment.  After  the  experiment’s  requirements  are  designed,  a  plan  is 
constructed  to  achieve  the  situation  desired.  After  the  execution  of  the  plan  and  the  ex¬ 
periment,  observations  are  collected  to  conclude  if  the  experiment  was  successful  or  not. 
Upon  success,  the  hypothesis  is  confirmed  and  the  domain  knowledge  is  adjusted.  Upon 
failure,  the  experimentation  process  is  iterated  until  success  or  until  no  more  hypotheses 
are left to be considered. This framework has proven to be an effective way to address
experimentation, but it also raises many issues.

The  learning  methods  are  not  completely  independent,  and  may  be  triggered  by  the 
same  failure.  For  example,  suppose  that  a  known  effect  of  an  operator  does  not  occur 
upon  execution.  This  triggers  two  methods  that  suggest  different  adjustments  to  the 
domain  knowledge:  either  the  effect  of  the  operator  is  conditional,  or  a  precondition 
is  missing.  Another  example  of  the  strong  interaction  between  methods  is  raised  by  a 
problem  that  the  planner  cannot  solve.  It  may  be  unsolvable  because  an  existing  operator 
is incomplete (e.g., missing an effect), because the domain is missing an operator, or
simply  be  unsolvable  regardless  of  completeness  of  knowledge.  A  framework  to  address 
the  interdependencies  of  the  methods  is  needed.  One  method  is  chosen  to  be  the  first, 
and  if  the  experiments  do  not  uncover  the  knowledge  fault  the  other  method  is  tried. 
This  issue  suggests  that  intelligent  shift  of  attention  would  be  very  advantageous. 

In  fact,  intelligent  shift  of  attention  is  necessary  at  all  levels  of  the  experimentation 
process,  as  shown  in  Figure  7.1.  If  the  current  hypothesis  (general  or  particular)  has  taken 
enough  time,  another  hypothesis  may  be  chosen  for  consideration.  If  no  satisfactory  plan 
is  found  for  an  experiment,  the  experiment  design  may  be  changed.  And  if  a  reasonable 
amount  of  time  and  resources  have  been  spent  on  studying  a  failure,  the  study  may  be 
suspended  and  continued  in  the  future  when  more  information  becomes  available. 
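
The experimentation cycle summarized in this section, combined with a very simple form of shift of attention, can be sketched as follows. This is an illustrative sketch only: the function name `learn_from_failure`, the `run_experiment` callback, and the fixed experiment budget are assumptions introduced here, not part of EXPO. The budget stands in for "a reasonable amount of time and resources".

```python
def learn_from_failure(hypotheses, run_experiment, max_experiments):
    """Test filtered hypotheses one at a time until one is confirmed.

    Returns (confirmed_hypothesis_or_None, hypotheses_left_untested).
    The study is suspended when the experiment budget is exhausted.
    """
    pending = list(hypotheses)
    tried = 0
    while pending and tried < max_experiments:
        hypothesis = pending.pop(0)      # one hypothesis is tested at a time
        tried += 1
        if run_experiment(hypothesis):   # design, plan, execute, observe
            return hypothesis, pending   # success: knowledge can be adjusted
    return None, pending                 # suspended or no hypotheses left


def simulated_experiment(hypothesis):
    """Stand-in for a real experiment: only (arm-empty) is confirmed."""
    return hypothesis == "(arm-empty)"


candidates = ["(carriable <object>)", "(next-to robot <object>)", "(arm-empty)"]
suspended = learn_from_failure(candidates, simulated_experiment, max_experiments=2)
completed = learn_from_failure(candidates, simulated_experiment, max_experiments=3)
```

With a budget of two experiments the study is suspended with one hypothesis still untested, and it could be resumed later from the returned list. With a budget of three the third hypothesis is confirmed.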

Learning from the environment is a necessary capability for autonomous intelligent
agents that must solve tasks in the real world. This thesis presents a step towards the
autonomous refinement of knowledge through experimentation.

Figure 7.1: Toward a framework for learning by experimentation. Failures in the execution
of a plan trigger learning. A general cause for the failure is hypothesized, then
instantiated to a particular hypothesis. The design of experiments includes planning the
experimental setup. A flexible framework for experimentation would include intelligent
shift of attention at all levels of the process, as indicated by the arrows on the right of
the figure.


Appendix  A 

The  Robot  Planning  Domain 


This  appendix  describes  the  robot  planning  domain  implemented  in  the  PRODIGY  archi¬ 
tecture  used  for  examples  and  empirical  tests  in  this  thesis. 

First,  it  includes  a  description  of  a  domain  and  a  quantitative  and  qualitative  char¬ 
acterization.  Then,  the  implementation  of  this  domain  in  the  PRODIGY  architecture  is 
listed.  The  rest  of  the  appendix  includes  the  incomplete  versions  and  problems  used  in 
the  empirical  tests,  and  the  numerical  results  obtained  that  were  used  in  the  graphs  for 
Chapter  6. 

This  domain  was  chosen  to  test  EXPO  because  of  its  realistic  description  of  a  robot 
task,  its  medium  size,  and  because  it  has  been  used  extensively  for  testing  other  learning 
methods  [Minton,  1988;  Etzioni,  1990;  Knoblock,  1991;  Perez  and  Etzioni,  1992].  The 
domain is essentially the same as the one used in these references, except that variable
types have been added to the preconditions. PRODIGY needs generators for every variable,
and the original domain used the predicates in the conditions as such. If a predicate that
is a generator were missing from the preconditions, PRODIGY could not use the operators
for planning.


A.1  Description of the Domain

This  domain  is  an  extension  of  the  one  used  for  STRIPS  [Fikes  and  Nilsson,  1971].  In 
the  original  domain,  a  robot  could  move  between  rooms  and  transport  boxes.  In  this 
domain,  the  robot  can  also  open  and  close  doors,  and  if  it  is  holding  the  right  key  it  can 
lock  and  unlock  doors.  Boxes  are  carriable  or  pushable,  and  all  keys  are  carriable.  Boxes 
and keys are objects. Only carriable objects may be held by the robot for transportation;
other objects must be pushed to be moved. The actions available are: pick up an object,
put down an object, put down an object next to another one, push an object to a door,
push an object through a door to another room, go through a door to another room, go
next to a door, push an object, go next to an object, and open, close, lock, and unlock
a door.

The domain can be qualitatively and quantitatively described as follows:

•  Some quantitative features are:

-  There are 14 operators.

-  There are 11 predicates: connects, carriable, pushable, is-room, is-object,
is-door, is-key, dr-to-rm, inroom, next-to, holding. Only 3 of them
(inroom, next-to, holding) are changed by the operators.

-  There are four types of variables: object, room, door, and key.

-  The average number of parameters for an operator is 2.

-  The average number of preconditions of operators is 4.

-  The average number of effects of operators is 4.

-  57 preconditions and 38 effects are learnable, a total of 95 learnable items.

•  All  the  operators’  effects  are  reversible. 

•  There  are  no  inference  rules  for  deducing  new  facts  about  a  given  state. 

•  There are no functions that compute the value of a predicate.

•  The precondition expression of all the operators is a conjunction of predicates that
are included in the state (i.e., are not to be derived or computed through a function,
as explained in Section 3.6).

•  There  are  no  negations  in  the  precondition  expressions. 

•  All effects of all operators are unconditional, i.e., their occurrence is not dependent
on  the  context  given  by  the  state  at  application  time  (as  explained  in  Section  3.6). 
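
The characteristics listed above (conjunctive preconditions that are looked up directly in the state, no negations, and unconditional add and delete effects) can be made concrete with a small sketch. The representation below is an assumption made for illustration and is not PRODIGY's internal format. In particular, arguments that start with `*` are treated here as wildcards in delete effects, mirroring how variables such as `<*other-ob18>` appear in the operator listing of Section A.2.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operator:
    name: str
    preconds: tuple  # conjunction of ground literals
    adds: tuple      # unconditional add effects
    dels: tuple      # unconditional delete effects; "*..." arguments match anything


def matches(pattern, fact):
    """True if fact matches pattern; '*'-prefixed arguments are wildcards."""
    return len(pattern) == len(fact) and all(
        p.startswith("*") or p == f for p, f in zip(pattern, fact)
    )


def apply_operator(op, state):
    """Return the successor state, or None if some precondition does not hold."""
    if not all(pre in state for pre in op.preconds):
        return None
    kept = {fact for fact in state if not any(matches(d, fact) for d in op.dels)}
    return frozenset(kept | set(op.adds))  # deletes are applied before adds


# GOTO-DR instantiated for door1 and room1.
goto_dr = Operator(
    name="GOTO-DR",
    preconds=(
        ("is-door", "door1"),
        ("dr-to-rm", "door1", "room1"),
        ("inroom", "robot", "room1"),
        ("is-room", "room1"),
    ),
    adds=(("next-to", "robot", "door1"),),
    dels=(("next-to", "robot", "*other"),),
)

state = frozenset({
    ("is-door", "door1"),
    ("is-room", "room1"),
    ("dr-to-rm", "door1", "room1"),
    ("inroom", "robot", "room1"),
    ("next-to", "robot", "box1"),
})
new_state = apply_operator(goto_dr, state)
```

After applying the instantiated GOTO-DR, the robot is next to door1 and no longer next to box1. Removing any precondition from the state makes `apply_operator` return `None`, which is the kind of mismatch between expectation and outcome that triggers learning when the operator definition is incomplete.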

All preconditions that are not type specifications are learnable by EXPO. The type
specifications must be present in the operator as generators for PRODIGY 2.0, the version
of the system on which EXPO is implemented (for more details on generators see [Minton
et al., 1989b]). However, this is not a deficiency of EXPO, but of PRODIGY 2.0, one that
is being corrected in later versions of the system [Veloso, 1989; Carbonell et al., 1992].

Only the effects used for backchaining are not learnable by EXPO. The reason for this
is that an operator must be used by the planner in order for EXPO to observe the
outcomes of its execution. When operators are written by a human, they express an
action or change in the world, so it is reasonable to assume that the operators initially
given to EXPO have some effect.


A.2  Domain Operators


(PICKUP-OBJ
  (params (<object>))
  (preconds (and
    (arm-empty)
    (next-to robot <object>)
    (is-object <object>)
    (carriable <object>)))
  (effects (
    (del (arm-empty))
    (del (next-to <object> <*other-ob30>))
    (del (next-to <*other-ob31> <object>))
    (add (holding <object>)))))

(PUTDOWN
  (params (<object>))
  (preconds (and
    (holding <object>)
    (is-object <object>)))
  (effects (
    (del (holding <*other-ob36>))
    (add (next-to robot <object>))
    (add (arm-empty)))))

(PUTDOWN-NEXT-TO
  (params (<object> <other-ob> <room>))
  (preconds (and
    (holding <object>)
    (is-object <object>)
    (is-object <other-ob>)
    (inroom <other-ob> <room>)
    (is-room <room>)
    (inroom <object> <room>)
    (next-to robot <other-ob>)))
  (effects (
    (del (holding <*other-ob35>))
    (add (next-to <object> <other-ob>))
    (add (next-to robot <object>))
    (add (next-to <other-ob> <object>))
    (add (arm-empty)))))

(PUSH-TO-DR
  (params (<object> <door> <room>))
  (preconds (and
    (is-door <door>)
    (dr-to-rm <door> <room>)
    (is-room <room>)
    (inroom <object> <room>)
    (is-object <object>)
    (next-to robot <object>)
    (pushable <object>)))
  (effects (
    (del (next-to robot <*other-ob3>))
    (del (next-to <object> <*other-ob5>))
    (del (next-to <*other-ob13> <object>))
    (add (next-to <object> <door>))
    (add (next-to robot <object>)))))

(PUSH-THRU-DR
  (params (<object> <door> <room> <other-room>))
  (preconds (and
    (is-room <room>)
    (dr-to-rm <door> <room>)
    (is-door <door>)
    (dr-open <door>)
    (next-to <object> <door>)
    (next-to robot <object>)
    (is-object <object>)
    (pushable <object>)
    (connects <door> <room> <other-room>)
    (is-room <other-room>)
    (inroom <object> <other-room>)))
  (effects (
    (del (next-to robot <*other-ob1>))
    (del (next-to <object> <*other-ob12>))
    (del (next-to <*other-ob7> <object>))
    (del (inroom robot <*other-ob21>))
    (del (inroom <object> <*other-ob22>))
    (add (inroom robot <room>))
    (add (inroom <object> <room>))
    (add (next-to robot <object>)))))

(GO-THRU-DR
  (params (<door> <room> <other-room>))
  (preconds (and
    (arm-empty)
    (is-room <room>)
    (dr-to-rm <door> <room>)
    (is-door <door>)
    (dr-open <door>)
    (next-to robot <door>)
    (connects <door> <room> <other-room>)
    (is-room <other-room>)
    (inroom robot <other-room>)))
  (effects (
    (del (next-to robot <*other-ob19>))
    (del (inroom robot <*other-ob20>))
    (add (inroom robot <room>)))))

(CARRY-THRU-DR
  (params (<object> <door> <room> <other-room>))
  (preconds (and
    (is-room <room>)
    (dr-to-rm <door> <room>)
    (is-door <door>)
    (dr-open <door>)
    (is-object <object>)
    (holding <object>)
    (connects <door> <room> <other-room>)
    (is-room <other-room>)
    (inroom <object> <other-room>)
    (inroom robot <other-room>)
    (next-to robot <door>)))
  (effects (
    (del (next-to robot <*other-ob48>))
    (del (inroom robot <*other-ob41>))
    (del (inroom <object> <*other-ob42>))
    (add (inroom robot <room>))
    (add (inroom <object> <room>)))))

(GOTO-DR
  (params (<door> <room>))
  (preconds (and
    (is-door <door>)
    (dr-to-rm <door> <room>)
    (inroom robot <room>)
    (is-room <room>)))
  (effects (
    (del (next-to robot <*other-ob18>))
    (add (next-to robot <door>)))))

(PUSH-BOX
  (params (<object> <other-ob> <room>))
  (preconds (and
    (is-object <object>)
    (is-object <other-ob>)
    (inroom <other-ob> <room>)
    (is-room <room>)
    (inroom <object> <room>)
    (pushable <object>)
    (next-to robot <object>)))
  (effects (
    (del (next-to robot <*other-ob14>))
    (del (next-to <object> <*other-ob6>))
    (del (next-to <*other-ob0> <object>))
    (add (next-to robot <object>))
    (add (next-to robot <other-ob>))
    (add (next-to <object> <other-ob>))
    (add (next-to <other-ob> <object>)))))

(GOTO-OBJ
  (params (<object> <room>))
  (preconds (and
    (is-object <object>)
    (inroom <object> <room>)
    (is-room <room>)
    (inroom robot <room>)))
  (effects (
    (add (next-to robot <object>))
    (del (next-to robot <*other-ob109>)))))

(OPEN
  (params (<door>))
  (preconds (and
    (is-door <door>)
    (unlocked <door>)
    (next-to robot <door>)
    (dr-closed <door>)))
  (effects (
    (del (dr-closed <door>))
    (add (dr-open <door>)))))

(CLOSE
  (params (<door>))
  (preconds (and
    (is-door <door>)
    (next-to robot <door>)
    (dr-open <door>)))
  (effects (
    (del (dr-open <door>))
    (add (dr-closed <door>)))))

(LOCK
  (params (<door> <key> <room>))
  (preconds (and
    (is-door <door>)
    (is-key <door> <key>)
    (holding <key>)
    (dr-to-rm <door> <room>)
    (is-room <room>)
    (inroom <key> <room>)
    (next-to robot <door>)
    (dr-closed <door>)
    (unlocked <door>)))
  (effects (
    (del (unlocked <door>))
    (add (locked <door>)))))

(UNLOCK
  (params (<door> <key> <room>))
  (preconds (and
    (is-door <door>)
    (is-key <door> <key>)
    (holding <key>)
    (dr-to-rm <door> <room>)
    (is-room <room>)
    (inroom <key> <room>)
    (inroom robot <room>)
    (next-to robot <door>)
    (locked <door>)))
  (effects (
    (del (locked <door>))
    (add (unlocked <door>)))))
A.3  Incomplete Domains

The 12 preconditions missing in the domain with 20% of the preconditions missing are
the following:

operator          precondition

pickup-obj        (arm-empty)
push-to-dr        (dr-to-rm <door> <room>)
go-thru-dr        (dr-open <door>)
carry-thru-dr     (connects <door> <room> <other-room>)
carry-thru-dr     (next-to robot <door>)
goto-obj          (inroom robot <room>)
open              (unlocked <door>)
open              (next-to robot <door>)
lock              (next-to robot <door>)
unlock            (holding <key>)
unlock            (inroom robot <room>)
unlock            (next-to robot <door>)

The 8 effects missing in Dpost20 (the domain with 20% of the effects missing) are the
following:

operator          postcondition

pickup-obj        (del (next-to <*other-ob31> <object>))
putdown-next-to   (del (holding <*other-ob35>))
push-thru-dr      (del (next-to <*other-ob7> <object>))
push-thru-dr      (del (inroom robot <*other-ob21>))
carry-thru-dr     (del (inroom robot <*other-ob41>))
carry-thru-dr     (del (inroom <object> <*other-ob42>))
open              (del (dr-closed <door>))
close             (del (dr-open <door>))
The 28 preconditions missing in the domain with 50% of the preconditions missing are
the following:

operator          precondition

pickup-obj        (arm-empty)
pickup-obj        (next-to robot <object>)
putdown           (holding <object>)
putdown-next-to   (inroom <object> <room>)
push-to-dr        (inroom <object> <room>)
push-to-dr        (next-to robot <object>)
push-thru-dr      (dr-to-rm <door> <room>)
push-thru-dr      (dr-open <door>)
push-thru-dr      (next-to <object> <door>)
push-thru-dr      (inroom <object> <other-room>)
carry-thru-dr     (next-to robot <door>)
goto-dr           (dr-to-rm <door> <room>)
push-box          (pushable <object>)
push-box          (next-to robot <object>)
goto-obj          (inroom <object> <room>)
open              (dr-closed <door>)
close             (next-to robot <door>)
close             (dr-open <door>)
lock              (holding <key>)
lock              (dr-to-rm <door> <other-room>)
lock              (inroom <key> <other-room>)
lock              (next-to robot <door>)
lock              (dr-closed <door>)
unlock            (holding <key>)
unlock            (dr-to-rm <door> <room>)
unlock            (inroom <key> <room>)
unlock            (inroom robot <room>)
unlock            (locked <door>)
The 19 effects missing in the domain with 50% of the effects missing are the following:

operator          postcondition

putdown           (add (next-to robot <object>))
putdown-next-to   (del (holding <*other-ob35>))
putdown-next-to   (add (next-to robot <object>))
push-to-dr        (add (next-to <object> <*other-ob5>))
push-to-dr        (add (next-to robot <object>))
push-thru-dr      (del (next-to robot <*other-ob1>))
push-thru-dr      (del (next-to <object> <*other-ob12>))
push-thru-dr      (del (inroom <object> <*other-ob22>))
push-thru-dr      (add (next-to robot <object>))
go-thru-dr        (del (next-to robot <*other-ob19>))
carry-thru-dr     (del (inroom <object> <*other-ob42>))
push-box          (del (next-to <object> <*other-ob5>))
push-box          (del (next-to <*other-ob6> <object>))
push-box          (add (next-to robot <object>))
push-box          (add (next-to robot <other-ob>))
goto-obj          (del (next-to robot <*other-ob109>))
close             (del (dr-open <door>))
lock              (del (unlocked <door>))
unlock            (del (locked <door>))

A.4  Training and Test Problems

The problems used to test this domain are a subset of those used in [Minton, 1988].
The problems were generated randomly, following a procedure described in that reference.
We used 60 training problems and 12 test problems. Problems 1 to 20 are taken from ps0
and ps1, and are called Train1. Problems 21 to 40 are taken from ps2 and ps3 and form
Train2. Problems 41 to 60 are taken from ps5 and ps6, and are called Train3. The twelve
test problems are in ps4.


A.5  Tables of Results


This  section  presents  the  numerical  results  that  were  used  for  the  graphs  in  Chapter  6. 




A.5.1  Missing 20% of the Preconditions

The following table shows the numerical results that are summarized in Figure 6.1 (20%
incompleteness):

number of            cumulative number of      number of plans successfully
training problems    learning opportunities    executed in test set

0                    0                         2
20                   8                         10
40                   9                         12
60                   10                        12

Notice  that  after  training  with  Train2,  100%  of  the  test  problems  can  be  solved. 
However,  the  domain  knowledge  is  still  not  complete,  so  EXPO  continues  learning  new 
facts  in  subsequent  training  problems. 

New preconditions for the 20%-incomplete domain were learned by EXPO in the following order:

1.  (next-to  robot  <door>)  of  CARRY-THRU-DR 

2.  (next-to  robot  <door>)  of  UNLOCK 

3.  (holding  <key>)  of  UNLOCK 

4.  (next-to  robot  <door>)  of  OPEN 

5.  (inroom  robot  <room>)  of  GOTO-OBJ 

6.  (unlocked  <door>)  of  OPEN 

7.  (dr-open  <door>)  of  GO-THRU-DR 

8.  (arm-empty) of PICKUP-OBJ

9.  (next-to  robot  <door>)  of  LOCK 

10.  (dr-to-rm <door> <room>) of PUSH-TO-DR


A.5.2  Missing 50% of the Preconditions

The following table shows the numerical results that are summarized in Figure 6.2 (50%
incompleteness):

number of            cumulative number of      number of plans successfully
training problems    learning opportunities    executed in test set

0                    0                         0
20                   13                        1
40                   14                        1
60                   17                        4
Testset              18                        12

In  this  case,  4  of  the  12  test  problems  cannot  be  successfully  executed  after  training 
with  all  the  training  sets.  This  is  due  to  the  nature  of  the  training  sets,  which  may  not 
uncover  all  the  necessary  failures.  This  is  shown  by  training  EXPO  with  the  test  set, 
after  which  all  the  test  problems  can  be  solved. 

New preconditions for the 50%-incomplete domain were learned by EXPO in the following order:

1.  (next-to robot <object>) of PICKUP-OBJ

2.  (holding  <object>)  of  PUTDOWN 

3.  (dr-closed  <door>)  of  LOCK 

4.  (next-to  robot  <door>)  of  CLOSE 

5.  (holding  <key>)  of  LOCK 

6.  (next-to  robot  <door>)  of  LOCK 

7.  (inroom  <key>  <room>)  of  LOCK 

8.  (next-to  robot  <door>)  of  CARRY-THRU-DR 

9.  (next-to  robot  <object>)  of  PUSH-TO-DR 

10.  (next-to  <object>  <door>)  of  PUSH-THRU-DR 

11.  (inroom <object> <room>) of PUSH-TO-DR

12.  (holding <key>) of UNLOCK

13.  (inroom  <object>  <room>)  of  GOTO-OBJ 

14.  (arm-empty)  of  PICKUP-OBJ 

15.  (dr-to-rm  <door>  <room>)  of  GOTO-DR 

16.  (inroom  robot  <room>)  of  UNLOCK 

17.  (dr-open  <door>)  of  PUSH-THRU-DR 

A.5.3  Missing 20% of the Effects

The following table shows the numerical results obtained from EXPO that are summarized
in Figure ?? for the domain with 20% incompleteness.

number of            cumulative number of      number of
training problems    learning opportunities    incorrect predictions

0                    0                         52
20                   1                         48
40                   3                         26
60                   5                         10

The postconditions learned by EXPO given the 20%-incomplete domain are (in this order):

1.  (del (dr-open <door>)) of CLOSE

2.  (del (dr-closed <door>)) of OPEN

3.  (del (inroom robot <other-room>)) of CARRY-THRU-DR

4.  (del (next-to robot <object>)) of PICKUP-OBJ

5.  (del (inroom robot <*var>)) of PUSH-THRU-DR

Notice that items 3 and 4 are more specific than the effects that actually appear in the
original domain. In fact, in item 3 the effect (del (inroom robot <other-room>))
learned by EXPO is the correct one for the operator: (del (inroom robot <*other-ob41>))
is overly general, since the robot is leaving the room <other-room>. In item 4, (del
(next-to robot <object>)) is overly specific because <object> ceases to be next to
anything else besides the robot. In this case, EXPO can learn that fact by adding another
effect (del (next-to <*var> <object>)).


A.5.4  Missing 50% of the Effects

The following table shows the numerical results obtained from EXPO that are summarized
in Figure ?? for the domain with 50% incompleteness.

number of            cumulative number of      number of
training problems    learning opportunities    incorrect predictions

0                    0                         59
20                   5                         24
40                   7                         20
60                   7                         20

The postconditions learned by EXPO given the 50%-incomplete domain are (in this order):

1.  (add (next-to robot <object>)) of PUSH-TO-DR

2.  (add (next-to robot <object>)) of PUSH-THRU-DR

3.  (del (next-to robot <*var>)) of GOTO-OBJ

4.  (del (dr-open <door>)) of CLOSE

5.  (del (inroom <object> <*var>)) of PUSH-THRU-DR

6.  (del (next-to robot <*var>)) of GO-THRU-DR

7.  (add (next-to robot <other-ob>)) of PUSH-BOX


Appendix  B 

The  Process  Planning  Domain 


This  appendix  describes  the  process  planning  domain  used  in  the  examples  and  in  the 
empirical  test  of  this  thesis.  This  domain  is  different  from  the  scheduling  domain  used  in 
other  work  in  PRODIGY.  First,  the  appendix  describes  a  domain  and  gives  a  quantitative 
and  qualitative  characterization  of  it.  Then,  the  implementation  in  the  PRODIGY  system 
is  listed.  The  rest  of  the  appendix  includes  the  incomplete  versions  and  problems  used 
in  the  empirical  tests,  and  the  numerical  results  obtained  that  were  used  in  the  graphs 
for  Chapter  6. 

A more complete description of the technical content of this process planning
specification can be found in [Gil, 1991].

This domain was chosen to test EXPO because it is very elaborate and knowledge
intensive. The variety of alternative processes, their complexity, and their interactions
make the planning task very complex.


B.1  Description of the Domain

Process planning is a major component of product manufacturing. A product is designed
to satisfy some desired set of specifications. A product is typically made of several
components, also called parts. When the design is completed, production continues by
planning the sequences of processes to be performed on raw material to produce a part.
This process planning includes operations to machine, join, and finish parts. Machining
processes include cutting the part to a certain size, making a feature such as a hole,
and producing a certain roughness on a surface. Joining operations include bolting and
welding parts. Finishing operations give the part a certain surface coating, such as a
rust-resistant finish.

Figure B.1: The Setup for a Drilling Operation

Each operation involves a machine, a holding device to grasp the part, and a tool.
Figure B.1 depicts a setup for drilling a hole.

A drilling machine holds a tool called a drill bit, and on its table there is a holding
device called a vise that is grasping the part.

There  are  many  constraints  for  the  tools  and  holding  devices  that  can  be  used  with 
each  machine. 

An expert machinist assisted in the construction of the domain, and helped with the
description of real machine setups and sample parts for constructing problems. Figure
B.2 shows an actual request. It is one of the examples included in [Hayes, 1990], selected
from a job shop that serves the Mechanical Engineering Department of Carnegie Mellon
University.

This domain can be qualitatively and quantitatively described as follows:

•  Some quantitative parameters are:

-  There are 117 rules, which include 73 operators and 44 inference rules.

-  There are 33 different types of objects.

-  There are 93 predicates. 55 of them are static (i.e., do not change during
problem solving), 27 of them are closed world (i.e., appear in the effects of
some operator but not in the effects of inference rules), 20 of them are open
world (i.e., deduced by inference rules), and 7 are computed by Lisp functions.

-  The average number of parameters for an operator is 5.

-  The average number of preconditions for an operator is 8.

-  The average number of effects for an operator is 6.

-  163 preconditions and 154 effects are learnable, a total of 317 learnable items.

•  The  effects  of  most  operators  are  not  reversible. 

•  The  precondition  expression  of  some  operators  involves  facts  not  present  in  the 
state  such  as  negations,  predicates  computed  by  functions,  and  predicates  derived 
by  inference  rules. 

•  There  are  context-dependent  effects  in  some  operators. 
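To make the last two characteristics concrete, the sketch below shows one way to evaluate a negated precondition and to apply a context-dependent effect to a state. It is an illustration under stated assumptions rather than PRODIGY's implementation: the tuple encoding, the function names, the part names, and the choice to test conditional effects against the state before the operator is applied are all hypothetical.

```python
def holds(state, cond):
    """Evaluate a ground condition; ("not", lit) holds when lit is absent."""
    if cond[0] == "not":
        return cond[1] not in state
    return cond in state


def apply_effects(state, effects):
    """Apply ground effects of the form ("add", lit), ("del", lit) or
    ("if", cond, effect). Conditions are tested against the input state."""
    adds, dels = set(), set()
    for eff in effects:
        if eff[0] == "if":
            _, cond, inner = eff
            if not holds(state, cond):
                continue            # context does not hold: effect is skipped
            eff = inner
        kind, lit = eff
        (adds if kind == "add" else dels).add(lit)
    return (state - dels) | adds


state = {
    ("is-clean", "part1"),
    ("surface-finish-side", "part1", "SIDE3", "POLISHED"),
}

# A negated precondition refers to a fact that is not present in the state.
no_burrs = holds(state, ("not", ("has-burrs", "part1")))

effects = [
    ("del", ("is-clean", "part1")),
    # Context-dependent effect: only fires if part1 has a finish on SIDE3.
    ("if", ("surface-finish-side", "part1", "SIDE3", "POLISHED"),
           ("add", ("surface-finish-side", "part3", "SIDE3", "POLISHED"))),
    # This one does not fire, because part2 has no recorded finish.
    ("if", ("surface-finish-side", "part2", "SIDE6", "POLISHED"),
           ("add", ("surface-finish-side", "part3", "SIDE6", "POLISHED"))),
]
new_state = apply_effects(state, effects)
```

The same operator therefore produces different changes depending on the state in which it is applied, which is what makes such effects harder to observe and learn.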

As  explained  in  Appendix  A,  type  specifications  and  backchaining  effects  are  not 
learnable  by  EXPO. 


B.2  The  Domain 


The  domain  operators,  inference  rules,  and  function  predicates  are  listed  below. 
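Before reading the listing, it may help to see how an operator of this form changes a state. The following is a minimal sketch, not the PRODIGY interpreter: the `Operator` class, `ground`, `apply_operator`, and all object names in the bindings are assumptions introduced for illustration, and the operator shown is a simplified transcription of the spot-drilling operator with positive preconditions only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operator:
    name: str
    preconds: tuple  # literals that must all be in the state
    adds: tuple      # literals added by the operator
    dels: tuple      # literals deleted by the operator


def ground(literal, bindings):
    """Replace variables such as "<part>" by the constants bound to them."""
    return tuple(bindings.get(term, term) for term in literal)


def apply_operator(op, state, bindings):
    """Check the preconditions, then return the state after the effects."""
    missing = [p for p in op.preconds if ground(p, bindings) not in state]
    if missing:
        raise ValueError(f"{op.name}: unsatisfied preconditions {missing}")
    dels = {ground(lit, bindings) for lit in op.dels}
    adds = {ground(lit, bindings) for lit in op.adds}
    return (set(state) - dels) | adds


SPOT_DRILL = Operator(
    name="DRILL-WITH-SPOT-DRILL",
    preconds=(
        ("is-a", "<part>", "PART"),
        ("is-a", "<machine>", "DRILL"),
        ("is-a", "<drill-bit>", "SPOT-DRILL"),
        ("holding-tool", "<machine>", "<drill-bit>"),
        ("holding", "<machine>", "<holding-dev>", "<part>", "<side>"),
    ),
    adds=(
        ("has-burrs", "<part>"),
        ("has-spot", "<part>", "<hole>", "<side>", "<loc-x>", "<loc-y>"),
    ),
    dels=(("is-clean", "<part>"),),
)

bindings = {
    "<part>": "part1", "<machine>": "drill1", "<drill-bit>": "bit1",
    "<holding-dev>": "vise1", "<side>": "SIDE1", "<hole>": "hole1",
    "<loc-x>": "1", "<loc-y>": "2",
}
state = {
    ("is-a", "part1", "PART"),
    ("is-a", "drill1", "DRILL"),
    ("is-a", "bit1", "SPOT-DRILL"),
    ("holding-tool", "drill1", "bit1"),
    ("holding", "drill1", "vise1", "part1", "SIDE1"),
    ("is-clean", "part1"),
}
new_state = apply_operator(SPOT_DRILL, state, bindings)
```

In this sketch the part ends up with a spot and burrs and is no longer clean, which is why several finishing operators in the listing require the part to be deburred and cleaned first.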


B.2.1  Operators

; MACHINE: DRILL

; operators for making holes

(DRILL-WITH-SPOT-DRILL
  (params (<machine> <drill-bit> <holding-dev> <part> <hole> <side>))
  (preconds (and
    (is-a <part> PART)
    (is-a <machine> DRILL)
    (is-a <drill-bit> SPOT-DRILL)
    (holding-tool <machine> <drill-bit>)
    (holding <machine> <holding-dev> <part> <side>)))
  (effects (
    (del (is-clean <part>))
    (add (has-burrs <part>))
    (add (has-spot <part> <hole> <side> <loc-x> <loc-y>)))))

(DRILL-WITH-TWIST-DRILL
  (params (<machine> <drill-bit> <holding-dev> <part> <hole> <side>
           <hole-depth> <hole-diam>))
  (preconds (and
    (is-a <part> PART)
    (is-a <machine> DRILL)
    (same <drill-bit-diam> <hole-diam>)
    (diameter-of-drill-bit <drill-bit> <drill-bit-diam>)
    (is-a <drill-bit> TWIST-DRILL)
    (has-spot <part> <hole> <side> <loc-x> <loc-y>)
    (holding-tool <machine> <drill-bit>)
    (holding <machine> <holding-dev> <part> <side>)))
  (effects (
    (del (is-clean <part>))
    (add (has-burrs <part>))
    (del (has-spot <part> <hole> <side> <loc-x> <loc-y>))
    (add (has-hole <part> <hole> <side> <hole-depth> <hole-diam>
                   <loc-x> <loc-y>)))))

(DRILL-WITH-HIGH-HELIX-DRILL
  (params (<machine> <drill-bit> <holding-dev> <part> <hole> <side>
           <hole-depth> <hole-diam>))
  (preconds (and
    (is-a <part> PART)
    (is-a <machine> DRILL)
    (same <drill-bit-diam> <hole-diam>)
    (diameter-of-drill-bit <drill-bit> <drill-bit-diam>)
    (is-a <drill-bit> HIGH-HELIX-DRILL)
    (has-fluid <machine> <fluid> <part>)
    (has-spot <part> <hole> <side> <loc-x> <loc-y>)
    (holding-tool <machine> <drill-bit>)
    (holding <machine> <holding-dev> <part> <side>)))
  (effects (
    (del (is-clean <part>))
    (add (has-burrs <part>))
    (del (has-spot <part> <hole> <side> <loc-x> <loc-y>))
    (add (has-hole <part> <hole> <side> <hole-depth> <hole-diam>
                   <loc-x> <loc-y>)))))

(DRILL-WITH-STRAIGHT-FLUTED-DRILL
  (params (<machine> <drill-bit> <holding-dev> <part> <hole> <side>
           <hole-depth> <hole-diam>))
  (preconds (and
    (is-a <part> PART)
    (is-a <machine> DRILL)
    (same <drill-bit-diam> <hole-diam>)
    (diameter-of-drill-bit <drill-bit> <drill-bit-diam>)
    (is-a <drill-bit> STRAIGHT-FLUTED-DRILL)
    (smaller <hole-depth> 2)
    (material-of <part> BRASS)
    (has-spot <part> <hole> <side> <loc-x> <loc-y>)
    (holding-tool <machine> <drill-bit>)
    (holding <machine> <holding-dev> <part> <side>)))
  (effects (
    (del (is-clean <part>))
    (add (has-burrs <part>))
    (del (has-spot <part> <hole> <side> <loc-x> <loc-y>))
    (add (has-hole <part> <hole> <side> <hole-depth> <hole-diam>
                   <loc-x> <loc-y>)))))

(DRILL-WITH-OIL-HOLE-DRILL
  (params (<machine> <drill-bit> <holding-dev> <part> <hole> <side>
           <hole-depth> <hole-diam>))
  (preconds (and
    (is-a <part> PART)
    (is-a <machine> DRILL)
    (same <drill-bit-diam> <hole-diam>)
    (diameter-of-drill-bit <drill-bit> <drill-bit-diam>)
    (is-a <drill-bit> OIL-HOLE-DRILL)
    (smaller <hole-depth> 20)
    (has-fluid <machine> <fluid> <part>)
    (has-spot <part> <hole> <side> <loc-x> <loc-y>)
    (holding-tool <machine> <drill-bit>)
    (holding <machine> <holding-dev> <part> <side>)))
  (effects (
    (del (is-clean <part>))
    (add (has-burrs <part>))
    (del (has-spot <part> <hole> <side> <loc-x> <loc-y>))
    (add (has-hole <part> <hole> <side> <hole-depth> <hole-diam>
                   <loc-x> <loc-y>)))))

(DRILL-WITH-GUN-DRILL
  (params (<machine> <drill-bit> <holding-dev> <part> <hole> <side>
           <hole-depth> <hole-diam>))
  (preconds (and
    (is-a <part> PART)
    (is-a <machine> DRILL)
    (same <drill-bit-diam> <hole-diam>)
    (diameter-of-drill-bit <drill-bit> <drill-bit-diam>)
    (is-a <drill-bit> GUN-DRILL)
    (has-fluid <machine> <fluid> <part>)
    (has-spot <part> <hole> <side> <loc-x> <loc-y>)
    (holding-tool <machine> <drill-bit>)
    (holding <machine> <holding-dev> <part> <side>)))
  (effects (
    (del (is-clean <part>))
    (add (has-burrs <part>))
    (del (has-spot <part> <hole> <side> <loc-x> <loc-y>))
    (add (has-hole <part> <hole> <side> <hole-depth> <hole-diam>
                   <loc-x> <loc-y>)))))


(DRILL-WITH-CENTER-DRILL
  (params (<machine> <drill-bit> <holding-dev> <part> <hole> <side>
           <drill-bit-diam> <loc-x> <loc-y>))
  (preconds (and
    (is-a <part> PART)
    (is-a <machine> DRILL)
    (diameter-of-drill-bit <drill-bit> <drill-bit-diam>)
    (same <drill-bit-diam> <hole-diam>)
    (is-a <drill-bit> CENTER-DRILL)
    (has-spot <part> <hole> <side> <loc-x> <loc-y>)
    (holding-tool <machine> <drill-bit>)
    (holding <machine> <holding-dev> <part> <side>)))
  (effects (
    (del (is-clean <part>))
    (add (has-burrs <part>))
    (del (has-spot <part> <hole> <side> <loc-x> <loc-y>))
    (add (has-hole <part> <hole> <side> 1/8 <hole-diam> <loc-x> <loc-y>))
    (add (has-center-hole <part> <hole> <side> <loc-x> <loc-y>)))))

; operators for finishing holes

(TAP
  (params (<machine> <drill-bit> <holding-dev> <part> <hole>))
  (preconds (and
    (is-a <part> PART)
    (is-a <machine> DRILL)
    (same <drill-bit-diam> <hole-diam>)
    (diameter-of-drill-bit <drill-bit> <drill-bit-diam>)
    (is-a <drill-bit> TAP)
    (has-hole <part> <hole> <side> <hole-depth> <hole-diam> <loc-x> <loc-y>)
    (holding-tool <machine> <drill-bit>)
    (~ (has-burrs <part>))
    (is-clean <part>)
    (holding <machine> <holding-dev> <part> <side>)))
  (effects (
    (del (is-clean <part>))
    (add (has-burrs <part>))
    (del (is-reamed <part> <hole> <side> <hole-depth> <hole-diam>
                    <loc-x> <loc-y>))
    (add (is-tapped <part> <hole> <side> <hole-depth> <hole-diam>
                    <loc-x> <loc-y>)))))

(COUNTERSINK
  (params (<machine> <drill-bit> <holding-dev> <part> <hole>))
  (preconds (and
    (is-a <part> PART)
    (is-a <machine> DRILL)
    (angle-of-drill-bit <drill-bit> <angle>)
    (is-a <drill-bit> COUNTERSINK)
    (has-hole <part> <hole> <side> <hole-depth> <hole-diam> <loc-x> <loc-y>)
    (holding-tool <machine> <drill-bit>)
    (~ (has-burrs <part>))
    (is-clean <part>)
    (holding <machine> <holding-dev> <part> <side>)))
  (effects (
    (del (is-clean <part>))
    (add (has-burrs <part>))
    (add (is-countersinked <part> <hole> <side> <hole-depth> <hole-diam>
                           <loc-x> <loc-y> <angle>)))))

(COUNTERBORE
  (params (<machine> <drill-bit> <holding-dev> <part> <hole>))
  (preconds (and
    (is-a <part> PART)
    (is-a <machine> DRILL)
    (size-of-drill-bit <drill-bit> <counterbore-size>)
    (is-a <drill-bit> COUNTERBORE)
    (has-hole <part> <hole> <side> <hole-depth> <hole-diam> <loc-x> <loc-y>)
    (holding-tool <machine> <drill-bit>)
    (~ (has-burrs <part>))
    (is-clean <part>)
    (holding <machine> <holding-dev> <part> <side>)))
  (effects (
    (del (is-clean <part>))
    (add (has-burrs <part>))
    (add (is-counterbored <part> <hole> <side> <hole-depth> <hole-diam>
                          <loc-x> <loc-y> <counterbore-size>)))))

(REAM
  (params (<machine> <drill-bit> <holding-dev> <part> <hole> <side>
           <hole-depth> <hole-diam>))
  (preconds (and
    (is-a <part> PART)
    (is-a <machine> DRILL)
    (same <drill-bit-diam> <hole-diam>)
    (diameter-of-drill-bit <drill-bit> <drill-bit-diam>)
    (is-a <drill-bit> REAMER)
    (smaller <hole-depth> 2)
    (has-fluid <machine> <fluid> <part>)
    (has-hole <part> <hole> <side> <hole-depth> <hole-diam> <loc-x> <loc-y>)
    (holding-tool <machine> <drill-bit>)
    (~ (has-burrs <part>))
    (is-clean <part>)
    (holding <machine> <holding-dev> <part> <side>)))
  (effects (
    (del (is-clean <part>))
    (add (has-burrs <part>))
    (del (is-tapped <part> <hole> <side> <hole-depth> <hole-diam>
                    <loc-x> <loc-y>))
    (add (is-reamed <part> <hole> <side> <hole-depth> <hole-diam>
                    <loc-x> <loc-y>)))))


; MACHINE: MILLING-MACHINE

(SIDE-MILL
  (params (<machine> <part> <milling-cutter> <holding-dev> <side> <dim>
           <value>))
  (preconds (and
    (is-a <part> PART)
    (is-a <machine> MILLING-MACHINE)
    (is-of-type <milling-cutter> MILLING-CUTTER)
    (or (same <dim> WIDTH)
        (same <dim> LENGTH))
    (size-of <part> <dim> <value-old>)
    (smaller <value> <value-old>)
    (smaller-than-1in <value-old> <value>)
    (side-up-for-machining <dim> <side>)
    (holding-tool <machine> <milling-cutter>)
    (holding <machine> <holding-dev> <part> <side>)))
  (effects (
    (del (is-clean <part>))
    (add (has-burrs <part>))
    (del (surface-coating-side <part> <side> <*surface-coating>))
    (del (surface-finish-side <part> <side> <*s-q>))
    (add (surface-finish-side <part> <side> ROUGH-MILL))
    (add (size-of <part> <dim> <value>))
    (del (size-of <part> <dim> <value-old>)))))

(FACE-MILL
  (params (<machine> <part> <milling-cutter> <holding-dev> <side> <dim>
           <value>))
  (preconds (and
    (is-a <part> PART)
    (is-a <machine> MILLING-MACHINE)
    (is-of-type <milling-cutter> MILLING-CUTTER)
    (same <dim> HEIGHT)
    (size-of <part> <dim> <value-old>)
    (smaller <value> <value-old>)
    (side-up-for-machining <dim> <side>)
    (holding-tool <machine> <milling-cutter>)
    (holding <machine> <holding-dev> <part> <side>)))
  (effects (
    (del (is-clean <part>))
    (add (has-burrs <part>))
    (del (surface-coating-side <part> <side> <*surface-coating>))
    (del (surface-finish-side <part> <side> <*s-q>))
    (add (surface-finish-side <part> <side> ROUGH-MILL))
    (add (size-of <part> <dim> <value>))
    (del (size-of <part> <dim> <value-old>)))))

(DRILL-WITH-SPOT-DRILL-IN-MILLING-MACHINE
  (params (<machine> <drill-bit> <holding-dev> <part> <hole> <side>))
  (preconds (and
    (is-a <part> PART)
    (is-a <machine> MILLING-MACHINE)
    (is-a <drill-bit> SPOT-DRILL)
    (holding-tool <machine> <drill-bit>)
    (holding <machine> <holding-dev> <part> <side>)))
  (effects (
    (del (is-clean <part>))
    (add (has-burrs <part>))
    (add (has-spot <part> <hole> <side> <loc-x> <loc-y>)))))

(DRILL-WITH-TWIST-DRILL-IN-MILLING-MACHINE
  (params (<machine> <drill-bit> <holding-dev> <part> <hole> <side>
           <hole-depth> <hole-diam>))
  (preconds (and
    (is-a <part> PART)
    (is-a <machine> MILLING-MACHINE)
    (same <drill-bit-diam> <hole-diam>)
    (diameter-of-drill-bit <drill-bit> <drill-bit-diam>)
    (is-a <drill-bit> TWIST-DRILL)
    (has-spot <part> <hole> <side> <loc-x> <loc-y>)
    (holding-tool <machine> <drill-bit>)
    (holding <machine> <holding-dev> <part> <side>)))
  (effects (
    (del (is-clean <part>))
    (add (has-burrs <part>))
    (del (has-spot <part> <hole> <side> <loc-x> <loc-y>))
    (add (has-hole <part> <hole> <side> <hole-depth> <hole-diam>
                   <loc-x> <loc-y>)))))

; MACHINE: LATHE

(ROUGH-TURN-RECTANGULAR-PART
  (params (<machine> <part> <toolbit> <holding-dev> <diameter-new>))
  (preconds (and
    (is-a <machine> LATHE)
    (is-a <toolbit> ROUGH-TOOLBIT)
    (shape-of <part> RECTANGULAR)
    (size-of <part> HEIGHT <h>)
    (size-of <part> WIDTH <w>)
    (smaller <diameter-new> <h>)
    (smaller <diameter-new> <w>)
    (holding-tool <machine> <toolbit>)
    (side-up-for-machining DIAMETER <side>)
    (holding <machine> <holding-dev> <part> <side>)))
  (effects (
    (del (is-clean <part>))
    (add (has-burrs <part>))
    (del (size-of <part> HEIGHT <h>))
    (del (size-of <part> WIDTH <w>))
    (add (size-of <part> DIAMETER <diameter-new>))
    (del (surface-coating-side <part> SIDE1 <*surface-coating>))
    (del (surface-coating-side <part> SIDE2 <*surface-coating>))
    (del (surface-coating-side <part> SIDE4 <*surface-coating>))
    (del (surface-coating-side <part> SIDE5 <*surface-coating>))
    (del (surface-coating-side <part> SIDE0 <*surface-coating>))
    (del (surface-finish-side <part> SIDE1 <sf1>))
    (del (surface-finish-side <part> SIDE2 <sf2>))
    (del (surface-finish-side <part> SIDE4 <sf4>))
    (del (surface-finish-side <part> SIDE5 <sf5>))
    (add (surface-finish-side <part> SIDE0 ROUGH-TURN)))))

(ROUGH-TURN-CYLINDRICAL-PART
  (params (<machine> <part> <toolbit> <holding-dev> <diameter-new>))
  (preconds (and
    (is-a <machine> LATHE)
    (is-a <toolbit> ROUGH-TOOLBIT)
    (shape-of <part> CYLINDRICAL)
    (size-of <part> DIAMETER <diam>)
    (smaller <diameter-new> <diam>)
    (holding-tool <machine> <toolbit>)
    (side-up-for-machining DIAMETER <side>)
    (holding <machine> <holding-dev> <part> <side>)))
  (effects (
    (del (is-clean <part>))
    (add (has-burrs <part>))
    (del (size-of <part> DIAMETER <diam>))
    (add (size-of <part> DIAMETER <diameter-new>))
    (del (surface-coating-side <part> SIDE0 <*surface-coating>))
    (del (surface-finish-side <part> SIDE0 <sf>))
    (add (surface-finish-side <part> SIDE0 ROUGH-TURN)))))

(FINISH-TURN
  (params (<machine> <part> <toolbit> <holding-dev> <diameter-new>))
  (preconds (and
    (is-a <machine> LATHE)
    (is-a <toolbit> FINISH-TOOLBIT)
    (shape-of <part> CYLINDRICAL)
    (size-of <part> DIAMETER <diam>)
    (finishing-size <diam> <diameter-new>)
    (holding-tool <machine> <toolbit>)
    (~ (has-burrs <part>))
    (is-clean <part>)
    (holding <machine> <holding-dev> <part> SIDE0)))
  (effects (
    (del (is-clean <part>))
    (add (has-burrs <part>))
    (del (size-of <part> DIAMETER <diam>))
    (add (size-of <part> DIAMETER <diameter-new>))
    (del (surface-coating-side <part> SIDE0 <*surface-coating>))
    (del (surface-finish-side <part> SIDE0 <sf>))
    (add (surface-finish-side <part> SIDE0 FINISH-TURN)))))

(MAKE-THREAD-WITH-LATHE
  (params (<machine> <part> <holding-dev> <side>))
  (preconds (and
    (is-a <part> PART)
    (is-a <machine> LATHE)
    (is-a <toolbit> V-THREAD)
    (shape-of <part> CYLINDRICAL)
    (holding-tool <machine> <toolbit>)
    (~ (has-burrs <part>))
    (is-clean <part>)
    (holding <machine> <holding-dev> <part> SIDE0)))
  (effects (
    (del (is-clean <part>))
    (add (has-burrs <part>))
    (del (surface-coating-side <part> SIDE0 <*surface-coating>))
    (del (surface-finish-side <part> SIDE0 <sf>))
    (add (surface-finish-side <part> SIDE0 TAPPED)))))

(MAKE-KNURL-WITH-LATHE
  (params (<machine> <part> <holding-dev> <side>))
  (preconds (and
    (is-a <part> PART)
    (is-a <machine> LATHE)
    (is-a <toolbit> KNURL)
    (shape-of <part> CYLINDRICAL)
    (holding-tool <machine> <toolbit>)
    (~ (has-burrs <part>))
    (is-clean <part>)
    (holding <machine> <holding-dev> <part> SIDE0)))
  (effects (
    (del (is-clean <part>))
    (add (has-burrs <part>))
    (del (surface-coating-side <part> SIDE0 <*surface-coating>))
    (del (surface-finish-side <part> SIDE0 <sf>))
    (add (surface-finish-side <part> SIDE0 KNURLED)))))

(FILE-WITH-LATHE
  (params (<machine> <part> <holding-dev> <lathe-file> <diameter-new>))
  (preconds (and
    (is-a <part> PART)
    (is-a <machine> LATHE)
    (is-a <lathe-file> LATHE-FILE)
    (shape-of <part> CYLINDRICAL)
    (size-of <part> DIAMETER <diam>)
    (finishing-size <diam> <diameter-new>)
    (~ (has-burrs <part>))
    (is-clean <part>)
    (holding <machine> <holding-dev> <part> SIDE0)))
  (effects (
    (del (is-clean <part>))
    (add (has-burrs <part>))
    (del (size-of <part> DIAMETER <diam>))
    (add (size-of <part> DIAMETER <diameter-new>))
    (del (surface-coating-side <part> SIDE0 <*surface-coating>))
    (del (surface-finish-side <part> SIDE0 <sf>))
    (add (surface-finish-side <part> SIDE0 ROUGH-GRIND)))))

(POLISH-WITH-LATHE
  (params (<machine> <part> <holding-dev> <cloth> <diameter-new>))
  (preconds (and
    (is-a <part> PART)
    (is-a <machine> LATHE)
    (is-a <cloth> ABRASIVE-CLOTH)
    (material-of-abrasive-cloth <cloth> EMERY)
    (shape-of <part> CYLINDRICAL)
    (~ (has-burrs <part>))
    (is-clean <part>)
    (holding <machine> <holding-dev> <part> SIDE0)))
  (effects (
    (del (is-clean <part>))
    (add (has-burrs <part>))
    (del (surface-coating-side <part> SIDE0 <*surface-coating>))
    (del (surface-finish-side <part> SIDE0 <*s-q>))
    (add (surface-finish-side <part> SIDE0 POLISHED)))))

; MACHINE: SHAPER

(ROUGH-SHAPE
  (params (<machine> <part> <cutting-tool> <holding-dev> <side> <dim>
           <value>))
  (preconds (and
    (is-a <part> PART)
    (is-a <machine> SHAPER)
    (is-a <cutting-tool> ROUGHING-CUTTING-TOOL)
    (size-of <part> <dim> <value-old>)
    (smaller <value> <value-old>)
    (side-up-for-machining <dim> <side>)
    (holding-tool <machine> <cutting-tool>)
    (holding <machine> <holding-dev> <part> <side>)))
  (effects (
    (del (is-clean <part>))
    (add (has-burrs <part>))
    (del (surface-coating-side <part> <side> <*surface-coating>))
    (del (surface-finish-side <part> <side> <*s-q>))
    (add (surface-finish-side <part> <side> ROUGH-SHAPED))
    (add (size-of <part> <dim> <value>))
    (del (size-of <part> <dim> <value-old>)))))

(FINISH-SHAPE
  (params (<machine> <part> <cutting-tool> <holding-dev> <side> <dim>
           <value>))
  (preconds (and
    (is-a <part> PART)
    (is-a <machine> SHAPER)
    (is-a <cutting-tool> FINISHING-CUTTING-TOOL)
    (size-of <part> <dim> <value-old>)
    (finishing-size <value-old> <value>)
    (side-up-for-machining <dim> <side>)
    (holding-tool <machine> <cutting-tool>)
    (~ (has-burrs <part>))
    (is-clean <part>)
    (holding <machine> <holding-dev> <part> <side>)))
  (effects (
    (del (is-clean <part>))
    (add (has-burrs <part>))
    (del (surface-coating-side <part> <side> <*surface-coating>))
    (del (surface-finish-side <part> <side> <*s-q>))
    (add (surface-finish-side <part> <side> FINISH-SHAPED))
    (add (size-of <part> <dim> <value>))
    (del (size-of <part> <dim> <value-old>)))))

; MACHINE: PLANER

(ROUGH-SHAPE-WITH-PLANER
  (params (<machine> <part> <cutting-tool> <holding-dev> <side> <dim>
           <value>))
  (preconds (and
    (is-a <part> PART)
    (is-a <machine> PLANER)
    (is-a <cutting-tool> ROUGHING-CUTTING-TOOL)
    (size-of <part> <dim> <value-old>)
    (smaller <value> <value-old>)
    (side-up-for-machining <dim> <side>)
    (holding-tool <machine> <cutting-tool>)
    (holding <machine> <holding-dev> <part> <side>)))
  (effects (
    (del (is-clean <part>))
    (add (has-burrs <part>))
    (del (surface-coating-side <part> <side> <*surface-coating>))
    (del (surface-finish-side <part> <side> <*s-q>))
    (add (surface-finish-side <part> <side> ROUGH-PLANED))
    (add (size-of <part> <dim> <value>))
    (del (size-of <part> <dim> <value-old>)))))

(FINISH-SHAPE-WITH-PLANER
  (params (<machine> <part> <cutting-tool> <holding-dev> <side> <dim>
           <value>))
  (preconds (and
    (is-a <part> PART)
    (is-a <machine> PLANER)
    (is-a <cutting-tool> FINISHING-CUTTING-TOOL)
    (size-of <part> <dim> <value-old>)
    (finishing-size <value-old> <value>)
    (side-up-for-machining <dim> <side>)
    (holding-tool <machine> <cutting-tool>)
    (~ (has-burrs <part>))
    (is-clean <part>)
    (holding <machine> <holding-dev> <part> <side>)))
  (effects (
    (del (is-clean <part>))
    (add (has-burrs <part>))
    (del (surface-coating-side <part> <side> <*surface-coating>))
    (del (surface-finish-side <part> <side> <*s-q>))
    (add (surface-finish-side <part> <side> FINISH-PLANED))
    (add (size-of <part> <dim> <value>))
    (del (size-of <part> <dim> <value-old>)))))

; MACHINE: GRINDER

(ROUGH-GRIND-WITH-HARD-WHEEL
  (params (<machine> <part> <wheel> <holding-dev> <side> <dim> <value>))
  (preconds (and
    (is-a <part> PART)
    (is-a <machine> GRINDER)
    (is-a <wheel> GRINDING-WHEEL)
    (has-fluid <machine> <fluid> <part>)
    (hardness-of-wheel <wheel> HARD)
    (hardness-of <part> SOFT)
    (~ (material-of <part> BRONZE))
    (~ (material-of <part> COPPER))
    (grit-of-wheel <wheel> COARSE-GRIT)
    (size-of <part> <dim> <value-old>)
    (smaller <value> <value-old>)
    (side-up-for-machining <dim> <side>)
    (holding-tool <machine> <wheel>)
    (holding <machine> <holding-dev> <part> <side>)))
  (effects (
    (del (is-clean <part>))
    (add (has-burrs <part>))
    (del (surface-coating-side <part> <side> <*surface-coating>))
    (del (surface-finish-side <part> <side> <*s-q>))
    (add (surface-finish-side <part> <side> ROUGH-GRIND))
    (add (size-of <part> <dim> <value>))
    (del (size-of <part> <dim> <value-old>)))))

(ROUGH-GRIND-WITH-SOFT-WHEEL
  (params (<machine> <part> <wheel> <holding-dev> <side> <dim> <value>))
  (preconds (and
    (is-a <part> PART)
    (is-a <machine> GRINDER)
    (is-a <wheel> GRINDING-WHEEL)
    (has-fluid <machine> <fluid> <part>)
    (hardness-of-wheel <wheel> SOFT)
    (hardness-of <part> HARD)
    (grit-of-wheel <wheel> COARSE-GRIT)
    (size-of <part> <dim> <value-old>)
    (smaller <value> <value-old>)
    (side-up-for-machining <dim> <side>)
    (holding-tool <machine> <wheel>)
    (holding <machine> <holding-dev> <part> <side>)))
  (effects (
    (del (is-clean <part>))
    (add (has-burrs <part>))
    (del (surface-coating-side <part> <side> <*surface-coating>))
    (del (surface-finish-side <part> <side> <*s-q>))
    (add (surface-finish-side <part> <side> ROUGH-GRIND))
    (add (size-of <part> <dim> <value>))
    (del (size-of <part> <dim> <value-old>)))))

(FINISH-GRIND-WITH-HARD-WHEEL
  (params (<machine> <part> <wheel> <holding-dev> <side> <dim> <value>))
  (preconds (and
    (is-a <part> PART)
    (is-a <machine> GRINDER)
    (is-a <wheel> GRINDING-WHEEL)
    (has-fluid <machine> <fluid> <part>)
    (hardness-of-wheel <wheel> HARD)
    (hardness-of <part> SOFT)
    (~ (material-of <part> BRONZE))
    (~ (material-of <part> COPPER))
    (grit-of-wheel <wheel> FINE-GRIT)
    (size-of <part> <dim> <value-old>)
    (finishing-size <value-old> <value>)
    (side-up-for-machining <dim> <side>)
    (holding-tool <machine> <wheel>)
    (~ (has-burrs <part>))
    (is-clean <part>)
    (holding <machine> <holding-dev> <part> <side>)))
  (effects (
    (del (is-clean <part>))
    (add (has-burrs <part>))
    (del (surface-coating-side <part> <side> <*surface-coating>))
    (del (surface-finish-side <part> <side> <*s-q>))
    (add (surface-finish-side <part> <side> FINISH-GRIND))
    (add (size-of <part> <dim> <value>))
    (del (size-of <part> <dim> <value-old>)))))


(FINISH-GRIND-WITH-SOFT-WHEEL
  (params (<machine> <part> <wheel> <holding-dev> <side> <dim> <value>))
  (preconds (and
    (is-a <part> PART)
    (is-a <machine> GRINDER)
    (is-a <wheel> GRINDING-WHEEL)
    (has-fluid <machine> <fluid> <part>)
    (hardness-of-wheel <wheel> SOFT)
    (hardness-of <part> HARD)
    (grit-of-wheel <wheel> FINE-GRIT)
    (size-of <part> <dim> <value-old>)
    (finishing-size <value-old> <value>)
    (side-up-for-machining <dim> <side>)
    (holding-tool <machine> <wheel>)
    (~ (has-burrs <part>))
    (is-clean <part>)
    (holding <machine> <holding-dev> <part> <side>)))
  (effects (
    (del (is-clean <part>))
    (add (has-burrs <part>))
    (del (surface-coating-side <part> <side> <*surface-coating>))
    (del (surface-finish-side <part> <side> <*s-q>))
    (add (surface-finish-side <part> <side> FINISH-GRIND))
    (add (size-of <part> <dim> <value>))
    (del (size-of <part> <dim> <value-old>)))))

; MACHINE: CIRCULAR-SAW

(CUT-WITH-CIRCULAR-COLD-SAW
  (params (<machine> <part> <attachment> <holding-dev> <dim> <value>))
  (preconds (and
    (is-a <part> PART)
    (is-a <machine> CIRCULAR-SAW)
    (is-a <attachment> COLD-SAW)
    (size-of <part> <dim> <value-old>)
    (smaller <value> <value-old>)
    (side-up-for-machining <dim> <side>)
    (holding-tool <machine> <attachment>)
    (holding <machine> <holding-dev> <part> <side>)))
  (effects (
    (del (is-clean <part>))
    (add (has-burrs <part>))
    (del (surface-coating-side <part> <side> <*surface-coating>))
    (del (surface-finish-side <part> <side> <*s-q>))
    (add (surface-finish-side <part> <side> FINISH-MILL))
    (del (size-of <part> <dim> <value-old>))
    (add (size-of <part> <dim> <value>)))))

(CUT-WITH-CIRCULAR-FRICTION-SAW
  (params (<machine> <part> <attachment> <holding-dev> <dim> <value>))
  (preconds (and
    (is-a <part> PART)
    (is-a <machine> CIRCULAR-SAW)
    (is-a <attachment> FRICTION-SAW)
    (has-fluid <machine> <fluid> <part>)
    (size-of <part> <dim> <value-old>)
    (smaller <value> <value-old>)
    (side-up-for-machining <dim> <side>)
    (holding-tool <machine> <attachment>)
    (holding <machine> <holding-dev> <part> <side>)))
  (effects (
    (del (is-clean <part>))
    (add (has-burrs <part>))
    (del (surface-coating-side <part> <side> <*surface-coating>))
    (del (surface-finish-side <part> <side> <*s-q>))
    (add (surface-finish-side <part> <side> ROUGH-MILL))
    (del (size-of <part> <dim> <value-old>))
    (add (size-of <part> <dim> <value>)))))

; MACHINE: BAND-SAW

(CUT-WITH-BAND-SAW
  (params (<machine> <part> <attachment> <dim> <value>))
  (preconds (and
    (is-a <part> PART)
    (is-a <machine> BAND-SAW)
    (is-a <attachment> SAW-BAND)
    (size-of <part> <dim> <value-old>)
    (smaller <value> <value-old>)
    (side-up-for-machining <dim> <side>)
    (holding-tool <machine> <attachment>)
    (~ (has-burrs <part>))
    (is-clean <part>)
    (on-table <machine> <part>)))
  (effects (
    (del (is-clean <part>))
    (add (has-burrs <part>))
    (del (surface-coating-side <part> <side> <*surface-coating>))
    (del (surface-finish-side <part> <side> <*s-q>))
    (add (surface-finish-side <part> <side> SAWCUT))
    (del (size-of <part> <dim> <value-old>))
    (add (size-of <part> <dim> <value>)))))

(POLISH-WITH-BAND-SAW
  (params (<machine> <part> <attachment> <side>))
  (preconds (and
    (is-a <part> PART)
    (is-a <machine> BAND-SAW)
    (is-a <attachment> SAW-BAND)
    (side-up-for-machining <dim> <side>)
    (holding-tool <machine> <attachment>)
    (~ (has-burrs <part>))
    (is-clean <part>)
    (on-table <machine> <part>)))
  (effects (
    (del (is-clean <part>))
    (add (has-burrs <part>))
    (del (surface-coating-side <part> <side> <*surface-coating>))
    (del (surface-finish-side <part> <side> <*old-sf-cond>))
    (add (surface-finish-side <part> <side> POLISHED)))))


; MACHINE: WELDER

(WELD-CYLINDERS-METAL-ARC
  (params (<machine> <part1> <part2> <part> <electrode> <holding-dev>
           <length>))
  (preconds (and
    (is-a <part1> PART)
    (is-a <part2> PART)
    (~ (same <part1> <part2>))
    (is-a <machine> METAL-ARC-WELDER)
    (is-a <electrode> ELECTRODE)
    (material-of <part1> <material1>)
    (material-of <part2> <material2>)
    (shape-of <part1> CYLINDRICAL)
    (shape-of <part2> CYLINDRICAL)
    (~ (exists (<hole>)
         (has-hole <part1> <hole> <*side> <*depth> <*diam> <*loc-x>
                   <*loc-y>)))
    (~ (exists (<hole>)
         (has-hole <part2> <hole> <*side> <*depth> <*diam> <*loc-x>
                   <*loc-y>)))
    (size-of <part1> DIAMETER <diameter1>)
    (size-of <part2> DIAMETER <diameter2>)
    (same <diameter1> <diameter2>)
    (size-of <part1> LENGTH <length1>)
    (size-of <part2> LENGTH <length2>)
    (new-size <length1> <length2> <length>)
    (new-part <part> <part1> <part2>)
    (new-material <material> <material1> <material2>)
    (holding-tool <machine> <electrode>)
    (holding <machine> <holding-dev> <part2> SIDE3)))
  (effects (
    (del (is-a <part1> PART))
    (del (is-a <part2> PART))
    (add (is-a <part> PART))
    (add (material-of <part> <material>))
    (add (size-of <part> DIAMETER <diameter1>))
    (add (size-of <part> LENGTH <length>))
    (add (surface-finish-side <part> SIDE0 SAWCUT))
    (if (surface-finish-side <part1> SIDE3 <sf31>)
        (add (surface-finish-side <part> SIDE3 <sf31>)))
    (if (surface-finish-side <part2> SIDE6 <sf62>)
        (add (surface-finish-side <part> SIDE6 <sf62>)))
    (del (holding <machine> <holding-dev> <part2> SIDE3))
    (add (holding <machine> <holding-dev> <part> SIDE3))
    (del (size-of <part1> DIAMETER <diam>))
    (del (size-of <part1> LENGTH <length1>))
    (del (size-of <part2> DIAMETER <diam>))
    (del (size-of <part2> LENGTH <length2>))
    (del (material-of <part1> <material1>))
    (del (material-of <part2> <material2>))
    (del (is-clean <part1>))
    (del (is-clean <part2>))
    (del (surface-coating-side <part1> <*sidea> <*surf-coatinga>))
    (del (surface-coating-side <part2> <*sideb> <*surf-coatingb>))
    (del (surface-finish-side <part1> <*sidec> <*sfc>))
    (del (surface-finish-side <part2> <*sided> <*sfd>)))))

(WELD-CYLINDERS-GAS
  (params (<machine> <part1> <part2> <part> <rod>
           <holding-dev> <length>))
  (preconds (and
    (is-a <part1> PART)
    (is-a <part2> PART)
    (~ (same <part1> <part2>))
    (is-a <machine> GAS-WELDER)
    (is-a <rod> WELDING-ROD)
    (is-a <torch> TORCH)
    (material-of <part1> <material1>)
    (material-of <part2> <material2>)
    (same <material1> <material2>)
    (shape-of <part1> CYLINDRICAL)
    (shape-of <part2> CYLINDRICAL)
    (~ (exists (<hole>)
         (has-hole <part1> <hole> <*side> <*depth>
                   <*diam> <*loc-x> <*loc-y>)))
    (~ (exists (<hole>)
         (has-hole <part2> <hole> <*side> <*depth>
                   <*diam> <*loc-x> <*loc-y>)))
    (size-of <part1> DIAMETER <diameter1>)
    (size-of <part2> DIAMETER <diameter2>)
    (same <diameter1> <diameter2>)
    (size-of <part1> LENGTH <length1>)
    (size-of <part2> LENGTH <length2>)
    (new-size <length1> <length2> <length>)
    (new-part <part> <part1> <part2>)
    (holding <machine> <holding-dev> <part2> SIDES)))
  (effects (
    (del (is-a <part1> PART))
    (del (is-a <part2> PART))
    (add (is-a <part> PART))
    (add (material-of <part> <material1>))
    (add (size-of <part> DIAMETER <diameter1>))
    (add (size-of <part> LENGTH <length>))
    (add (surface-finish-side <part> SIDED SAWCUT))
    (if (surface-finish-side <part1> SIDES <sfS1>)
        (add (surface-finish-side <part> SIDES <sfS1>)))
    (if (surface-finish-side <part2> SIDES <sfS2>)
        (add (surface-finish-side <part> SIDED <sfS2>)))
    (del (holding <machine> <holding-dev> <part2> SIDES))
    (add (holding <machine> <holding-dev> <part> SIDES))
    (del (size-of <part1> DIAMETER <diam>))
    (del (size-of <part1> LENGTH <length1>))
    (del (size-of <part2> DIAMETER <diam>))
    (del (size-of <part2> LENGTH <length2>))
    (del (material-of <part1> <material1>))
    (del (material-of <part2> <material2>))
    (del (is-clean <part1>))
    (del (is-clean <part2>))
    (del (surface-coating-side <part1> <*sidea> <*surface-coatinga>))
    (del (surface-coating-side <part2> <*sideb> <*surface-coatingb>))
    (del (surface-finish-side <part1> <*sidec> <*sfc>))
    (del (surface-finish-side <part2> <*sided> <*sfd>)))))

; METAL-COATING

(METAL-SPRAY-COATING-CORROSION-RESISTANT
  (params (<machine> <wire> <part> <side>
           <another-machine> <holding-dev>))
  (preconds (and
    (is-a <part> PART)
    (is-a <machine> ELECTRIC-ARC-SPRAY-GUN)
    (is-a <wire> SPRAYING-METAL-WIRE)
    (material-of-wire <wire> STAINLESS-STEEL)
    (~ (material-of-wire <wire> TUNGSTEN))
    (~ (material-of-wire <wire> MOLYBDENUM))
    (is-clean <part>)
    (~ (has-burrs <part>))
    (surface-coating-side <part> <side> FUSED-METAL)
    (is-of-type <another-machine> MACHINE)
    (holding <another-machine> <holding-dev> <part> <side>)))
  (effects (
    (add (surface-coating-side <part> <side> CORROSION-RESISTANT))
    (del (surface-coating-side <part> <side> FUSED-METAL)))))

(METAL-SPRAY-COATING-HEAT-RESISTANT
  (params (<machine> <wire> <part> <side>
           <another-machine> <holding-dev>))
  (preconds (and
    (is-a <part> PART)
    (is-a <machine> ELECTRIC-ARC-SPRAY-GUN)
    (is-a <wire> SPRAYING-METAL-WIRE)
    (material-of-wire <wire> ZIRCONIUM-OXIDE)
    (~ (material-of-wire <wire> TUNGSTEN))
    (~ (material-of-wire <wire> MOLYBDENUM))
    (is-clean <part>)
    (~ (has-burrs <part>))
    (surface-coating-side <part> <side> FUSED-METAL)
    (is-of-type <another-machine> MACHINE)
    (holding <another-machine> <holding-dev> <part> <side>)))
  (effects (
    (add (surface-coating-side <part> <side> HEAT-RESISTANT))
    (del (surface-coating-side <part> <side> FUSED-METAL)))))


(METAL-SPRAY-COATING-WEAR-RESISTANT
  (params (<machine> <wire> <part> <side>
           <another-machine> <holding-dev>))
  (preconds (and
    (is-a <part> PART)
    (is-a <machine> ELECTRIC-ARC-SPRAY-GUN)
    (is-a <wire> SPRAYING-METAL-WIRE)
    (material-of-wire <wire> ALUMINUM-OXIDE)
    (~ (material-of-wire <wire> TUNGSTEN))
    (~ (material-of-wire <wire> MOLYBDENUM))
    (is-clean <part>)
    (~ (has-burrs <part>))
    (surface-coating-side <part> <side> FUSED-METAL)
    (is-of-type <another-machine> MACHINE)
    (holding <another-machine> <holding-dev> <part> <side>)))
  (effects (
    (add (surface-coating-side <part> <side> WEAR-RESISTANT))
    (del (surface-coating-side <part> <side> FUSED-METAL)))))

(METAL-SPRAY-PREPARE
  (params (<machine> <wire> <part> <side>
           <another-machine> <holding-dev>))
  (preconds (and
    (is-a <part> PART)
    (is-a <machine> ELECTRIC-ARC-SPRAY-GUN)
    (is-a <wire> SPRAYING-METAL-WIRE)
    (has-high-melting-point <wire>)
    (is-clean <part>)
    (~ (has-burrs <part>))
    (is-of-type <another-machine> MACHINE)
    (holding <another-machine> <holding-dev> <part> <side>)))
  (effects (
    (del (surface-coating-side <part> <side> <*s-f>))
    (add (surface-coating-side <part> <side> FUSED-METAL)))))

; OTHER OPERATIONS

(CLEAN
  (params (<part>))
  (preconds (and
    (is-a <part> PART)
    (is-available-part <part>)))
  (effects (
    (add (is-clean <part>)))))

(REMOVE-BURRS
  (params (<part> <brush>))
  (preconds (and
    (is-a <part> PART)
    (is-a <brush> BRUSH)
    (is-available-part <part>)))
  (effects (
    (del (is-clean <part>))
    (del (has-burrs <part>)))))


; operators for preparing the machines

; tools in machines

(PUT-TOOL-ON-MILLING-MACHINE
  (params (<machine> <attachment>))
  (preconds (and
    (is-a <machine> MILLING-MACHINE)
    (or (is-of-type <attachment> MILLING-CUTTER)
        (is-of-type <attachment> DRILL-BIT))
    (is-available-tool-holder <machine>)
    (is-available-tool <attachment>)))
  (effects (
    (add (holding-tool <machine> <attachment>)))))

(PUT-IN-DRILL-SPINDLE
  (params (<machine> <drill-bit>))
  (preconds (and
    (is-a <machine> DRILL)
    (is-of-type <drill-bit> DRILL-BIT)
    (is-available-tool-holder <machine>)
    (is-available-tool <drill-bit>)))
  (effects (
    (add (holding-tool <machine> <drill-bit>)))))

(PUT-TOOLBIT-IN-LATHE
  (params (<machine> <toolbit>))
  (preconds (and
    (is-a <machine> LATHE)
    (is-of-type <toolbit> LATHE-TOOLBIT)
    (is-available-tool-holder <machine>)
    (is-available-tool <toolbit>)))
  (effects (
    (add (holding-tool <machine> <toolbit>)))))

(PUT-CUTTING-TOOL-IN-SHAPER-OR-PLANER
  (params (<machine> <cutting-tool>))
  (preconds (and
    (or (is-a <machine> SHAPER)
        (is-a <machine> PLANER))
    (is-of-type <cutting-tool> CUTTING-TOOL)
    (is-available-tool-holder <machine>)
    (is-available-tool <cutting-tool>)))
  (effects (
    (add (holding-tool <machine> <cutting-tool>)))))

(PUT-WHEEL-IN-GRINDER
  (params (<machine> <wheel>))
  (preconds (and
    (is-a <machine> GRINDER)
    (is-a <wheel> GRINDING-WHEEL)
    (is-available-tool-holder <machine>)
    (is-available-tool <wheel>)))
  (effects (
    (add (holding-tool <machine> <wheel>)))))

(PUT-CIRCULAR-SAW-ATTACHMENT-IN-CIRCULAR-SAW
  (params (<machine> <attachment>))
  (preconds (and
    (is-a <machine> CIRCULAR-SAW)
    (is-of-type <attachment> CIRCULAR-SAW-ATTACHMENT)
    (is-available-tool-holder <machine>)
    (is-available-tool <attachment>)))
  (effects (
    (add (holding-tool <machine> <attachment>)))))

(PUT-BAND-SAW-ATTACHMENT-IN-BAND-SAW
  (params (<machine> <attachment>))
  (preconds (and
    (is-a <machine> BAND-SAW)
    (is-of-type <attachment> BAND-SAW-ATTACHMENT)
    (is-available-tool-holder <machine>)
    (is-available-tool <attachment>)))
  (effects (
    (add (holding-tool <machine> <attachment>)))))

(PUT-ELECTRODE-IN-WELDER
  (params (<machine> <electrode>))
  (preconds (and
    (is-a <machine> METAL-ARC-WELDER)
    (is-a <electrode> ELECTRODE)
    (is-available-tool-holder <machine>)
    (is-available-tool <electrode>)))
  (effects (
    (add (holding-tool <machine> <electrode>)))))

(REMOVE-TOOL-FROM-MACHINE
  (params (<machine> <tool>))
  (preconds (and
    (is-of-type <machine> MACHINE)
    (is-of-type <tool> MACHINE-TOOL)
    (holding-tool <machine> <tool>)))
  (effects (
    (del (holding-tool <machine> <tool>)))))


; holding devices in machines

(PUT-HOLDING-DEVICE-IN-MILLING-MACHINE
  (params (<machine> <holding-dev>))
  (preconds (and
    (is-a <machine> MILLING-MACHINE)
    (or
      (is-a <holding-dev> 4-JAW-CHUCK)
      (is-a <holding-dev> V-BLOCK)
      (is-a <holding-dev> VISE)
      (is-a <holding-dev> COLLET-CHUCK)
      (is-a <holding-dev> TOE-CLAMP))
    (is-available-table <machine> <holding-dev>)
    (is-available-holding-device <holding-dev>)))
  (effects (
    (add (has-device <machine> <holding-dev>)))))

(PUT-HOLDING-DEVICE-IN-DRILL
  (params (<machine> <holding-dev>))
  (preconds (and
    (is-a <machine> DRILL)
    (or
      (is-a <holding-dev> 4-JAW-CHUCK)
      (is-a <holding-dev> V-BLOCK)
      (is-a <holding-dev> VISE)
      (is-a <holding-dev> TOE-CLAMP))
    (is-available-table <machine> <holding-dev>)
    (is-available-holding-device <holding-dev>)))
  (effects (
    (add (has-device <machine> <holding-dev>)))))

(PUT-HOLDING-DEVICE-IN-LATHE
  (params (<machine> <holding-dev>))
  (preconds (and
    (is-a <machine> LATHE)
    (or (is-a <holding-dev> CENTERS)
        (is-a <holding-dev> 4-JAW-CHUCK)
        (is-a <holding-dev> COLLET-CHUCK))
    (is-available-table <machine> <holding-dev>)
    (is-available-holding-device <holding-dev>)))
  (effects (
    (add (has-device <machine> <holding-dev>)))))

(PUT-HOLDING-DEVICE-IN-SHAPER
  (params (<machine> <holding-dev>))
  (preconds (and
    (is-a <machine> SHAPER)
    (is-a <holding-dev> VISE)
    (is-available-table <machine> <holding-dev>)
    (is-available-holding-device <holding-dev>)))
  (effects (
    (add (has-device <machine> <holding-dev>)))))

(PUT-HOLDING-DEVICE-IN-PLANER
  (params (<machine> <holding-dev>))
  (preconds (and
    (is-a <machine> PLANER)
    (is-a <holding-dev> TOE-CLAMP)
    (is-available-table <machine> <holding-dev>)
    (is-available-holding-device <holding-dev>)))
  (effects (
    (add (has-device <machine> <holding-dev>)))))

(PUT-HOLDING-DEVICE-IN-GRINDER
  (params (<machine> <holding-dev>))
  (preconds (and
    (is-a <machine> GRINDER)
    (or (is-a <holding-dev> MAGNETIC-CHUCK)
        (is-a <holding-dev> V-BLOCK)
        (is-a <holding-dev> VISE))
    (is-available-table <machine> <holding-dev>)
    (is-available-holding-device <holding-dev>)))
  (effects (
    (add (has-device <machine> <holding-dev>)))))


(PUT-HOLDING-DEVICE-IN-CIRCULAR-SAW
  (params (<machine> <holding-dev>))
  (preconds (and
    (is-a <machine> CIRCULAR-SAW)
    (or (is-a <holding-dev> VISE)
        (is-a <holding-dev> V-BLOCK))
    (is-available-table <machine> <holding-dev>)
    (is-available-holding-device <holding-dev>)))
  (effects (
    (add (has-device <machine> <holding-dev>)))))

(PUT-HOLDING-DEVICE-IN-WELDER
  (params (<machine> <holding-dev>))
  (preconds (and
    (is-of-type <machine> WELDER)
    (or (is-a <holding-dev> VISE)
        (is-a <holding-dev> TOE-CLAMP))
    (is-available-table <machine> <holding-dev>)
    (is-available-holding-device <holding-dev>)))
  (effects (
    (add (has-device <machine> <holding-dev>)))))


(REMOVE-HOLDING-DEVICE-FROM-MACHINE
  (params (<machine> <holding-dev>))
  (preconds (and
    (is-of-type <machine> MACHINE)
    (is-of-type <holding-dev> HOLDING-DEVICE)
    (has-device <machine> <holding-dev>)
    (is-empty-holding-device <holding-dev> <machine>)))
  (effects (
    (del (has-device <machine> <holding-dev>)))))


; cutting fluid in machines

(ADD-SOLUBLE-OIL
  (params (<machine> <fluid>))
  (preconds (and
    (is-of-type <machine> MACHINE)
    (is-a <part> PART)
    (or (material-of <part> STEEL)
        (material-of <part> ALUMINUM))
    (is-a <fluid> SOLUBLE-OIL)))
  (effects (
    (add (has-fluid <machine> <fluid> <part>)))))

(ADD-MINERAL-OIL
  (params (<machine> <fluid>))
  (preconds (and
    (is-of-type <machine> MACHINE)
    (is-a <part> PART)
    (is-a <fluid> MINERAL-OIL)
    (material-of <part> IRON)))
  (effects (
    (add (has-fluid <machine> <fluid> <part>)))))

(ADD-ANY-CUTTING-FLUID
  (params (<machine> <fluid>))
  (preconds (and
    (is-of-type <machine> MACHINE)
    (is-a <part> PART)
    (or (material-of <part> BRASS)
        (material-of <part> BRONZE)
        (material-of <part> COPPER))
    (is-of-type <fluid> CUTTING-FLUID)))
  (effects (
    (add (has-fluid <machine> <fluid> <part>)))))


; operators for holding parts with a device in a machine

(PUT-ON-MACHINE-TABLE
  (params (<machine> <part>))
  (preconds (and
    (is-a <part> PART)
    (is-of-type <machine> MACHINE)
    (~ (is-a <machine> SHAPER))
    (is-available-part <part>)
    (is-available-machine <machine>)))
  (effects (
    (del (on-table <another-machine> <part>))
    (add (on-table <machine> <part>)))))


(PUT-ON-SHAPER-TABLE
  (params (<machine> <part>))
  (preconds (and
    (is-a <part> PART)
    (is-a <machine> SHAPER)
    (size-of-machine <machine> <shaper-size>)
    (size-of <part> LENGTH <part-size>)
    (smaller <part-size> <shaper-size>)
    (is-available-part <part>)
    (is-available-machine <machine>)))
  (effects (
    (del (on-table <another-machine> <part>))
    (add (on-table <machine> <part>)))))


(HOLD-WITH-V-BLOCK
  (params (<machine> <holding-dev> <part> <side>))
  (preconds (and
    (is-of-type <machine> MACHINE)
    (is-a <part> PART)
    (is-a <holding-dev> V-BLOCK)
    (has-device <machine> <holding-dev>)
    (~ (has-burrs <part>))
    (is-clean <part>)
    (on-table <machine> <part>)
    (shape-of <part> CYLINDRICAL)
    (same <side> SIDE0)
    (is-empty-holding-device <holding-dev> <machine>)
    (is-available-part <part>)))
  (effects (
    (del (on-table <machine> <part>))
    (add (holding-weakly <machine> <holding-dev> <part> <side>)))))

(HOLD-WITH-VISE
  (params (<machine> <holding-dev> <part> <side>))
  (preconds (and
    (is-of-type <machine> MACHINE)
    (is-a <part> PART)
    (is-a <holding-dev> VISE)
    (has-device <machine> <holding-dev>)
    (~ (has-burrs <part>))
    (is-clean <part>)
    (on-table <machine> <part>)
    (is-empty-holding-device <holding-dev> <machine>)
    (is-available-part <part>)))
  (effects (
    (del (on-table <machine> <part>))
    (if (shape-of <part> CYLINDRICAL)
        (add (holding-weakly <machine> <holding-dev> <part> <side>)))
    (if (shape-of <part> RECTANGULAR)
        (add (holding <machine> <holding-dev> <part> <side>))))))

(HOLD-WITH-TOE-CLAMP
  (params (<machine> <holding-dev> <part> <side>))
  (preconds (and
    (is-of-type <machine> MACHINE)
    (is-a <part> PART)
    (is-a <holding-dev> TOE-CLAMP)
    (has-device <machine> <holding-dev>)
    (~ (has-burrs <part>))
    (is-clean <part>)
    (or (shape-of <part> RECTANGULAR)
        (same <side> SIDE3)
        (same <side> SIDE6))
    (on-table <machine> <part>)
    (is-empty-holding-device <holding-dev> <machine>)
    (is-available-part <part>)))
  (effects (
    (del (on-table <machine> <part>))
    (add (holding <machine> <holding-dev> <part> <side>)))))

(SECURE-WITH-TOE-CLAMP
  (params (<machine> <holding-dev> <part> <side>))
  (preconds (and
    (is-of-type <machine> MACHINE)
    (is-a <part> PART)
    (is-a <holding-dev> TOE-CLAMP)
    (has-device <machine> <holding-dev>)
    (~ (has-burrs <part>))
    (is-clean <part>)
    (shape-of <part> CYLINDRICAL)
    (holding-weakly <machine> <another-holding-device> <part> <side>)
    (is-empty-holding-device <holding-dev> <machine>)))
  (effects (
    (del (on-table <machine> <part>))
    (add (holding <machine> <holding-dev> <part> <side>)))))

(HOLD-WITH-CENTERS
  (params (<machine> <holding-dev> <part> <side>))
  (preconds (and
    (is-of-type <machine> MACHINE)
    (is-a <part> PART)
    (is-a <holding-dev> CENTERS)
    (has-device <machine> <holding-dev>)
    (has-center-holes <part>)
    (~ (has-burrs <part>))
    (is-clean <part>)
    (on-table <machine> <part>)
    (shape-of <part> CYLINDRICAL)
    (is-empty-holding-device <holding-dev> <machine>)
    (is-available-part <part>)))
  (effects (
    (del (on-table <machine> <part>))
    (add (holding <machine> <holding-dev> <part> <side>)))))

(HOLD-WITH-4-JAW-CHUCK
  (params (<machine> <holding-dev> <part> <side>))
  (preconds (and
    (is-of-type <machine> MACHINE)
    (is-a <part> PART)
    (is-a <holding-dev> 4-JAW-CHUCK)
    (has-device <machine> <holding-dev>)
    (~ (has-burrs <part>))
    (is-clean <part>)
    (on-table <machine> <part>)
    (is-empty-holding-device <holding-dev> <machine>)
    (is-available-part <part>)))
  (effects (
    (del (on-table <machine> <part>))
    (add (holding <machine> <holding-dev> <part> <side>)))))

(HOLD-WITH-COLLET-CHUCK
  (params (<machine> <holding-dev> <part> <side>))
  (preconds (and
    (is-of-type <machine> MACHINE)
    (is-a <part> PART)
    (is-a <holding-dev> COLLET-CHUCK)
    (has-device <machine> <holding-dev>)
    (~ (has-burrs <part>))
    (is-clean <part>)
    (on-table <machine> <part>)
    (shape-of <part> CYLINDRICAL)
    (is-empty-holding-device <holding-dev> <machine>)
    (is-available-part <part>)))
  (effects (
    (del (on-table <machine> <part>))
    (add (holding <machine> <holding-dev> <part> <side>)))))

(HOLD-WITH-MAGNETIC-CHUCK
  (params (<machine> <holding-dev> <part> <side>))
  (preconds (and
    (is-of-type <machine> MACHINE)
    (is-a <part> PART)
    (is-a <holding-dev> MAGNETIC-CHUCK)
    (has-device <machine> <holding-dev>)
    (~ (has-burrs <part>))
    (is-clean <part>)
    (on-table <machine> <part>)
    (is-empty-holding-device <holding-dev> <machine>)
    (is-available-part <part>)))
  (effects (
    (del (on-table <machine> <part>))
    (add (holding <machine> <holding-dev> <part> <side>)))))

(RELEASE-FROM-HOLDING-DEVICE
  (params (<machine> <holding-dev> <part> <side>))
  (preconds (and
    (is-of-type <machine> MACHINE)
    (is-a <part> PART)
    (is-of-type <holding-dev> HOLDING-DEVICE)
    (holding <machine> <holding-dev> <part> <side>)))
  (effects (
    (del (holding <machine> <holding-dev> <part> <side>))
    (add (on-table <machine> <part>)))))

(RELEASE-FROM-HOLDING-DEVICE-WEAK
  (params (<machine> <holding-dev> <part> <side>))
  (preconds (and
    (is-of-type <machine> MACHINE)
    (is-a <part> PART)
    (is-of-type <holding-dev> HOLDING-DEVICE)
    (holding-weakly <machine> <holding-dev> <part> <side>)))
  (effects (
    (del (holding-weakly <machine> <holding-dev> <part> <side>))
    (add (on-table <machine> <part>)))))
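
Every operator in this listing follows the same shape: a list of preconditions and a list of `add`/`del` effects over literals. As an illustration only, the following Python sketch shows how such an effect list could be applied to a state. It assumes that a state is a set of ground literals represented as tuples; the helper names `substitute` and `apply_effects` and the object names `drill1`, `vise1` and `part5` are hypothetical, and the sketch is not the planner's own implementation. Conditional (`if`) effects and `<*var>` wildcard deletions are deliberately left out.

```python
from typing import Dict, FrozenSet, Iterable, Tuple

Literal = Tuple[str, ...]      # e.g. ("on-table", "drill1", "part5")
Effect = Tuple[str, Literal]   # ("add" or "del", literal pattern)


def substitute(pattern: Literal, bindings: Dict[str, str]) -> Literal:
    """Replace each <variable> in a pattern by its bound constant."""
    return tuple(bindings.get(term, term) for term in pattern)


def apply_effects(state: FrozenSet[Literal],
                  effects: Iterable[Effect],
                  bindings: Dict[str, str]) -> FrozenSet[Literal]:
    """Return the state obtained by applying add/del effects in order."""
    new_state = set(state)
    for kind, pattern in effects:
        literal = substitute(pattern, bindings)
        if kind == "del":
            new_state.discard(literal)
        elif kind == "add":
            new_state.add(literal)
        else:
            raise ValueError(f"unknown effect kind: {kind}")
    return frozenset(new_state)


# Effects of RELEASE-FROM-HOLDING-DEVICE-WEAK, transcribed from the listing.
RELEASE_WEAK_EFFECTS = [
    ("del", ("holding-weakly", "<machine>", "<holding-dev>", "<part>", "<side>")),
    ("add", ("on-table", "<machine>", "<part>")),
]

# Hypothetical object names, chosen only for this illustration.
bindings = {"<machine>": "drill1", "<holding-dev>": "vise1",
            "<part>": "part5", "<side>": "SIDE1"}
state = frozenset({
    ("holding-weakly", "drill1", "vise1", "part5", "SIDE1"),
    ("is-clean", "part5"),
})
new_state = apply_effects(state, RELEASE_WEAK_EFFECTS, bindings)
```

In this example the part ends up on the table of `drill1`, the `holding-weakly` literal is removed, and the unrelated `is-clean` literal is carried over unchanged.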


B.2.2  Inference  Rules 

(HAS-CENTER-HOLES
  (params (<part> <x2> <y2>))
  (preconds (and
    (is-a <part> PART)
    (or (and
          (shape-of <part> RECTANGULAR)
          (size-of <part> WIDTH <x>)
          (size-of <part> HEIGHT <y>))
        (and
          (shape-of <part> CYLINDRICAL)
          (size-of <part> DIAMETER <x>)
          (size-of <part> DIAMETER <y>)))
    (half-of <x> <x2>)
    (half-of <y> <y2>)
    (has-center-hole <part> CENTER-HOLE-SIDE3 SIDE3
                     <x2> <y2>)
    (is-countersinked <part> CENTER-HOLE-SIDE3 SIDE3
                      1/8 1/16 <x2> <y2> 60)
    (has-center-hole <part> CENTER-HOLE-SIDE6 SIDE6
                     <x2> <y2>)
    (is-countersinked <part> CENTER-HOLE-SIDE6 SIDE6
                      1/8 1/16 <x2> <y2> 60)))
  (effects (
    (add (has-center-holes <part>)))))


(SIDE-UP-FOR-MACHINING-LENGTH
  (params (<side>))
  (preconds (and
    (same <dim> LENGTH)
    (or (same <side> SIDE3)
        (same <side> SIDE6))))
  (effects (
    (add (side-up-for-machining <dim> <side>)))))

(SIDE-UP-FOR-MACHINING-WIDTH
  (params (<side>))
  (preconds (and
    (same <dim> WIDTH)
    (or (same <side> SIDE2)
        (same <side> SIDE5))))
  (effects (
    (add (side-up-for-machining <dim> <side>)))))

(SIDE-UP-FOR-MACHINING-HEIGHT
  (params (<side>))
  (preconds (and
    (same <dim> HEIGHT)
    (or (same <side> SIDE1)
        (same <side> SIDE4))))
  (effects (
    (add (side-up-for-machining <dim> <side>)))))

(SIDE-UP-FOR-MACHINING-DIAMETER
  (params (<side>))
  (preconds (and
    (same <dim> DIAMETER)
    (or (and
          (shape-of <part> RECTANGULAR)
          (same <side> SIDED))
        (and
          (shape-of <part> CYLINDRICAL)
          (same <side> SIDE0)))))
  (effects (
    (add (side-up-for-machining <dim> <side>)))))


; inference rules for availability

(MACHINE-AVAILABLE
  (params (<machine>))
  (preconds (and
    (is-of-type <machine> MACHINE)
    (~ (exists (<other-part>)
         (on-table <machine> <other-part>)))))
  (effects (
    (add (is-available-machine <machine>)))))

(TOOL-HOLDER-AVAILABLE
  (params (<machine>))
  (preconds (and
    (is-of-type <machine> MACHINE)
    (~ (exists (<tool>)
         (holding-tool <machine> <tool>)))))
  (effects (
    (add (is-available-tool-holder <machine>)))))

(TOOL-AVAILABLE
  (params (<tool>))
  (preconds (and
    (is-of-type <tool> MACHINE-TOOL)
    (~ (exists (<machine>)
         (holding-tool <machine> <tool>)))))
  (effects (
    (add (is-available-tool <tool>)))))

(TABLE-AVAILABLE
  (params (<machine>))
  (preconds (and
    (is-of-type <machine> MACHINE)
    (is-of-type <holding-dev> HOLDING-DEVICE)
    (or
      (~ (exists (<another-holding-device>)
           (has-device <machine> <another-holding-device>)))
      (is-a <holding-dev> TOE-CLAMP))))
  (effects (
    (add (is-available-table <machine> <holding-dev>)))))

(HOLDING-DEVICE-AVAILABLE
  (params (<machine> <holding-dev>))
  (preconds (and
    (is-of-type <holding-dev> HOLDING-DEVICE)
    (~ (exists (<machine>)
         (has-device <machine> <holding-dev>)))))
  (effects (
    (add (is-available-holding-device <holding-dev>)))))

(PART-AVAILABLE
  (params (<part>))
  (preconds (and
    (is-a <part> PART)
    (~ (exists (<machine>)
         (holding-weakly <machine> <*holding-dev>
                         <part> <*side>)))
    (~ (exists (<machine>)
         (holding <machine> <*another-holding-dev>
                  <part> <*side>)))))
  (effects (
    (add (is-available-part <part>)))))

(HOLDING-DEVICE-EMPTY
  (params (<machine> <holding-dev>))
  (preconds (and
    (is-of-type <machine> MACHINE)
    (is-of-type <holding-dev> HOLDING-DEVICE)
    (~ (exists (<part>)
         (holding-weakly <machine> <holding-dev>
                         <part> <side>)))
    (~ (exists (<another-part>)
         (holding <machine> <holding-dev>
                  <another-part> <side>)))))
  (effects (
    (add (is-empty-holding-device <holding-dev> <machine>)))))


; inference rules for shape

(IS-RECTANGULAR
  (params (<part>))
  (preconds (and
    (is-a <part> PART)
    (size-of <part> LENGTH <l>)
    (size-of <part> WIDTH <w>)
    (size-of <part> HEIGHT <h>)))
  (effects (
    (add (shape-of <part> RECTANGULAR)))))

(IS-CYLINDRICAL
  (params (<part>))
  (preconds (and
    (is-a <part> PART)
    (size-of <part> LENGTH <l>)
    (size-of <part> DIAMETER <d>)))
  (effects (
    (add (shape-of <part> CYLINDRICAL)))))


; inference rules for surface finish


(IS-MACHINED-SURFACE-QUALITY
  (params (<part> <side>))
  (preconds (and
    (is-a <part> PART)
    (or
      (surface-finish-side <part> <side> ROUGH-MILL)
      (surface-finish-side <part> <side> ROUGH-TURN)
      (surface-finish-side <part> <side> ROUGH-SHAPED)
      (surface-finish-side <part> <side> ROUGH-PLANED)
      (surface-finish-side <part> <side> FINISH-PLANED)
      (surface-finish-side <part> <side> COLD-ROLLED)
      (surface-finish-side <part> <side> FINISH-MILL)
      (surface-finish-side <part> <side> FINISH-TURN)
      (surface-finish-quality-side <part> <side> GROUND))))
  (effects (
    (add (surface-finish-quality-side <part> <side> MACHINED)))))


(IS-GROUND-SURFACE-QUALITY
  (params (<part> <side>))
  (preconds (and
    (is-a <part> PART)
    (or
      (surface-finish-side <part> <side> ROUGH-GRIND)
      (surface-finish-side <part> <side> FINISH-GRIND))))
  (effects (
    (add (surface-finish-quality-side <part> <side> GROUND)))))


(ARE-SIDES-OF-RECTANGULAR-PART
  (params (<part>))
  (preconds (and
    (is-a <part> PART)
    (shape-of <part> RECTANGULAR)))
  (effects (
    (add (side-of <part> SIDE1))
    (add (side-of <part> SIDE2))
    (add (side-of <part> SIDE3))
    (add (side-of <part> SIDE4))
    (add (side-of <part> SIDE5))
    (add (side-of <part> SIDE6)))))


(HAS-SURFACE-FINISH-RECTANGULAR-PART
  (params (<part>))
  (preconds (and
    (is-a <part> PART)
    (shape-of <part> RECTANGULAR)
    (surface-finish-side <part> SIDE1 <surface-finish>)
    (surface-finish-side <part> SIDE2 <surface-finish>)
    (surface-finish-side <part> SIDE3 <surface-finish>)
    (surface-finish-side <part> SIDE4 <surface-finish>)
    (surface-finish-side <part> SIDE5 <surface-finish>)
    (surface-finish-side <part> SIDE6 <surface-finish>)))
  (effects (
    (add (surface-finish <part> <surface-finish>)))))


(ARE-SIDES-OF-CYLINDRICAL-PART
  (params (<part>))
  (preconds (and
    (is-a <part> PART)
    (shape-of <part> CYLINDRICAL)))
  (effects (
    (add (side-of <part> SIDE0))
    (add (side-of <part> SIDE3))
    (add (side-of <part> SIDE6)))))


(HAS-SURFACE-FINISH-CYLINDRICAL-PART
  (params (<part>))
  (preconds (and
    (is-a <part> PART)
    (shape-of <part> CYLINDRICAL)
    (surface-finish-side <part> SIDE0 <surface-finish>)
    (surface-finish-side <part> SIDE3 <surface-finish>)
    (surface-finish-side <part> SIDE6 <surface-finish>)))
  (effects (
    (add (surface-finish <part> <surface-finish>)))))


(HAVE-SURFACE-FINISH-RECTANGULAR-PART-SIDES
  (params (<part>))
  (preconds (and
    (is-a <part> PART)
    (shape-of <part> RECTANGULAR)
    (surface-finish <part> <surf-fin>)))
  (effects (
    (add (surface-finish-side <part> SIDE1 <surf-fin>))
    (add (surface-finish-side <part> SIDE2 <surf-fin>))
    (add (surface-finish-side <part> SIDE3 <surf-fin>))
    (add (surface-finish-side <part> SIDE4 <surf-fin>))
    (add (surface-finish-side <part> SIDE5 <surf-fin>))
    (add (surface-finish-side <part> SIDE6 <surf-fin>)))))


(HAVE-SURFACE-FINISH-CYLINDRICAL-PART-SIDES
  (params (<part>))
  (preconds (and
    (is-a <part> PART)
    (shape-of <part> CYLINDRICAL)
    (surface-finish <part> <surf-fin>)))
  (effects (
    (add (surface-finish-side <part> SIDE0 <surf-fin>))
    (add (surface-finish-side <part> SIDE3 <surf-fin>))
    (add (surface-finish-side <part> SIDE6 <surf-fin>)))))


    (surface-coating <part> <surf-coat>)))
  (effects (
    (add (surface-coating-side <part> SIDE1 <surf-coat>))
    (add (surface-coating-side <part> SIDE2 <surf-coat>))
    (add (surface-coating-side <part> SIDE3 <surf-coat>))
    (add (surface-coating-side <part> SIDE4 <surf-coat>))
    (add (surface-coating-side <part> SIDE5 <surf-coat>))
    (add (surface-coating-side <part> SIDE6 <surf-coat>)))))

(HAVE-SURFACE-COATING-CYLINDRICAL-PART-SIDES
  (params (<part>))
  (preconds (and
    (is-a <part> PART)
    (shape-of <part> CYLINDRICAL)
    (surface-coating <part> <surf-coat>)))
  (effects (
    (add (surface-coating-side <part> SIDE0 <surf-coat>))
    (add (surface-coating-side <part> SIDE3 <surf-coat>))
    (add (surface-coating-side <part> SIDE6 <surf-coat>)))))


; inference rules for surface-coating


(HAS-SURFACE-COATING-RECTANGULAR-PART
  (params (<part>))
  (preconds (and
    (is-a <part> PART)
    (shape-of <part> RECTANGULAR)
    (surface-coating-side <part> SIDE1 <surf-coat>)
    (surface-coating-side <part> SIDE2 <surf-coat>)
    (surface-coating-side <part> SIDE3 <surf-coat>)
    (surface-coating-side <part> SIDE4 <surf-coat>)
    (surface-coating-side <part> SIDE5 <surf-coat>)
    (surface-coating-side <part> SIDE6 <surf-coat>)))
  (effects (
    (add (surface-coating <part> <surf-coat>)))))


(MATERIAL-FERROUS
  (params (<part>))
  (preconds (and
    (is-a <part> PART)
    (or
      (material-of <part> STEEL)
      (material-of <part> IRON))))
  (effects (
    (add (alloy-of <part> FERROUS)))))

(MATERIAL-NON-FERROUS
  (params (<part>))
  (preconds (and
    (is-a <part> PART)
    (or
      (material-of <part> BRASS)
      (material-of <part> COPPER)
      (material-of <part> BRONZE))))
  (effects (
    (add (alloy-of <part> NON-FERROUS)))))


(HAS-SURFACE-COATING-CYLINDRICAL-PART
  (params (<part>))
  (preconds (and
    (is-a <part> PART)
    (shape-of <part> CYLINDRICAL)
    (surface-coating-side <part> SIDE0 <surf-coat>)
    (surface-coating-side <part> SIDE3 <surf-coat>)
    (surface-coating-side <part> SIDE6 <surf-coat>)))
  (effects (
    (add (surface-coating <part> <surf-coat>)))))


(HAVE-SURFACE-COATING-RECTANGULAR-PART-SIDES
  (params (<part>))
  (preconds (and
    (is-a <part> PART)
    (shape-of <part> RECTANGULAR)


(HARDNESS-OF-MATERIAL-SOFT
  (params (<part>))
  (preconds (and
    (is-a <part> PART)
    (or
      (material-of <part> ALUMINUM)
      (alloy-of <part> NON-FERROUS))))
  (effects (
    (add (hardness-of <part> SOFT)))))

(HARDNESS-OF-MATERIAL-HARD
  (params (<part>))
  (preconds (and
    (is-a <part> PART)
    (alloy-of <part> FERROUS)))
  (effects (
    (add (hardness-of <part> HARD)))))

(HIGH-MELTING-POINT
  (params (<wire>))
  (preconds (and
    (is-a <wire> SPRAYING-METAL-WIRE)
    (or
      (material-of-wire <wire> TUNGSTEN)
      (material-of-wire <wire> MOLYBDENUM))))
  (effects (
    (add (has-high-melting-point <wire>)))))


; inference rules for types

(IS-MACHINE
  (params (<machine>))
  (preconds
    (or
      (is-a <machine> DRILL)
      (is-a <machine> LATHE)
      (is-a <machine> SHAPER)
      (is-a <machine> PLANER)
      (is-a <machine> GRINDER)
      (is-a <machine> BAND-SAW)
      (is-a <machine> CIRCULAR-SAW)
      (is-a <machine> MILLING-MACHINE)
      (is-of-type <machine> WELDER)))
  (effects (
    (add (is-of-type <machine> MACHINE)))))

(IS-WELDER
  (params (<machine>))
  (preconds
    (or
      (is-a <machine> METAL-ARC-WELDER)
      (is-a <machine> GAS-WELDER)))
  (effects (
    (add (is-of-type <machine> WELDER)))))
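
The two type rules above chain: an object that is a gas or metal-arc welder is of type WELDER, and anything of type WELDER is in turn of type MACHINE. The following Python sketch illustrates that chaining as a small forward-chaining closure. It is only an illustration under stated assumptions: the table names `FROM_IS_A` and `FROM_IS_OF_TYPE`, the function `infer_types`, and the object names `w1`, `d1` and `p1` are hypothetical, and only the two rules IS-WELDER and IS-MACHINE are encoded.

```python
from typing import Dict, Set, Tuple

# Supertype -> classes whose instances (is-a facts) belong to it.
FROM_IS_A: Dict[str, Set[str]] = {
    "WELDER": {"METAL-ARC-WELDER", "GAS-WELDER"},
    "MACHINE": {"DRILL", "LATHE", "SHAPER", "PLANER", "GRINDER",
                "BAND-SAW", "CIRCULAR-SAW", "MILLING-MACHINE"},
}

# Supertype -> types (is-of-type facts) that belong to it.
FROM_IS_OF_TYPE: Dict[str, Set[str]] = {
    "MACHINE": {"WELDER"},
}


def infer_types(is_a: Dict[str, str]) -> Set[Tuple[str, str]]:
    """Derive all (object, type) pairs implied by the is-a facts."""
    derived: Set[Tuple[str, str]] = set()

    # First pass: rules whose condition is an is-a fact.
    for obj, cls in is_a.items():
        for supertype, classes in FROM_IS_A.items():
            if cls in classes:
                derived.add((obj, supertype))

    # Fixed point: rules whose condition is an already derived type.
    changed = True
    while changed:
        changed = False
        for obj, typ in list(derived):
            for supertype, types in FROM_IS_OF_TYPE.items():
                if typ in types and (obj, supertype) not in derived:
                    derived.add((obj, supertype))
                    changed = True
    return derived


# Hypothetical objects: a gas welder, a drill, and a part.
types = infer_types({"w1": "GAS-WELDER", "d1": "DRILL", "p1": "PART"})
```

Here `w1` is derived to be both a WELDER and a MACHINE, `d1` only a MACHINE, and `p1` receives no derived type because no encoded rule mentions PART.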

(IS-TOOL
  (params (<tool>))
  (preconds
    (or
      (is-of-type <tool> MACHINE-TOOL)
      (is-of-type <tool> OPERATOR-TOOL)))
  (effects (
    (add (is-of-type <tool> TOOL)))))

(IS-MACHINE-TOOL
 (params (<attachment>))
 (preconds
  (or
   (is-of-type <attachment> DRILL-BIT)
   (is-of-type <attachment> LATHE-TOOLBIT)
   (is-of-type <attachment> CUTTING-TOOL)
   (is-a <attachment> GRINDING-WHEEL)
   (is-of-type <attachment> BAND-SAW-ATTACHMENT)
   (is-of-type <attachment> CIRCULAR-SAW-ATTACHMENT)
   (is-of-type <attachment> MILLING-CUTTER)
   (is-a <attachment> ELECTRODE)))
 (effects (
           (add (is-of-type <attachment> MACHINE-TOOL)))))

(IS-DRILL-BIT
 (params (<drill-bit>))
 (preconds
  (or
   (is-a <drill-bit> SPOT-DRILL)
   (is-a <drill-bit> CENTER-DRILL)
   (is-a <drill-bit> TWIST-DRILL)
   (is-a <drill-bit> STRAIGHT-FLUTED-DRILL)
   (is-a <drill-bit> HIGH-HELIX-DRILL)
   (is-a <drill-bit> OIL-HOLE-DRILL)
   (is-a <drill-bit> GUN-DRILL)
   (is-a <drill-bit> CORE-DRILL)
   (is-a <drill-bit> TAP)
   (is-a <drill-bit> COUNTERSINK)
   (is-a <drill-bit> COUNTERBORE)
   (is-a <drill-bit> REAMER)))
 (effects (
           (add (is-of-type <drill-bit> DRILL-BIT)))))

(IS-LATHE-TOOLBIT
 (params (<toolbit>))
 (preconds
  (or
   (is-a <toolbit> ROUGH-TOOLBIT)
   (is-a <toolbit> FINISH-TOOLBIT)
   (is-a <toolbit> V-THREAD)
   (is-a <toolbit> KNURL)))
 (effects (
           (add (is-of-type <toolbit> LATHE-TOOLBIT)))))

(IS-CUTTING-TOOL
 (params (<cutting-tool>))
 (preconds
  (or
   (is-a <cutting-tool> ROUGHING-CUTTING-TOOL)
   (is-a <cutting-tool> FINISHING-CUTTING-TOOL)))
 (effects (
           (add (is-of-type <cutting-tool> CUTTING-TOOL)))))

(IS-CIRCULAR-SAW-ATTACHMENT
 (params (<attachment>))
 (preconds
  (or
   (is-a <attachment> COLD-SAW)
   (is-a <attachment> FRICTION-SAW)))
 (effects (
           (add (is-of-type <attachment>
                            CIRCULAR-SAW-ATTACHMENT)))))

(IS-BAND-SAW-ATTACHMENT
 (params (<attachment>))
 (preconds
  (or
   (is-a <attachment> SAW-BAND)
   (is-a <attachment> BAND-FILE)))
 (effects (
           (add (is-of-type <attachment>
                            BAND-SAW-ATTACHMENT)))))

(IS-MILLING-CUTTER
 (params (<milling-cutter>))
 (preconds
  (or
   (is-a <milling-cutter> PLAIN-MILL)
   (is-a <milling-cutter> END-MILL)))
 (effects (
           (add (is-of-type <milling-cutter>
                            MILLING-CUTTER)))))

(IS-OPERATOR-TOOL
 (params (<tool>))
 (preconds
  (or
   (is-a <tool> LATHE-FILE)
   (is-a <tool> ABRASIVE-CLOTH)
   (is-a <tool> TORCH)
   (is-a <tool> WELDING-ROD)
   (is-a <tool> SPRAYING-METAL-WIRE)
   (is-a <tool> BRUSH)))
 (effects (
           (add (is-of-type <tool> OPERATOR-TOOL)))))

(IS-CUTTING-FLUID
 (params (<cutting-fluid>))
 (preconds
  (or
   (is-a <cutting-fluid> SOLUBLE-OIL)
   (is-a <cutting-fluid> MINERAL-OIL)))
 (effects (
           (add (is-of-type <cutting-fluid> CUTTING-FLUID)))))

(IS-HOLDING-DEVICE
 (params (<holding-dev>))
 (preconds
  (or
   (is-a <holding-dev> V-BLOCK)
   (is-a <holding-dev> VISE)
   (is-a <holding-dev> TOE-CLAMP)
   (is-a <holding-dev> CENTERS)
   (is-a <holding-dev> 4-JAW-CHUCK)
   (is-a <holding-dev> COLLET-CHUCK)
   (is-a <holding-dev> MAGNETIC-CHUCK)))
 (effects (
           (add (is-of-type <holding-dev> HOLDING-DEVICE)))))


B.2.3  Functions 

(defun same (x y)
  (cond ((is-variable x)
         (return-binding x y))
        ((is-variable y)
         (return-binding y x))
        (t
         (equal x y))))

(defun half-of (x y)
  (cond ((is-variable x)
         'no-match-attempted)
        ((is-variable y)
         (return-binding y (/ x 2)))
        ((= (/ x 2) y) t)))

(defun smaller (x y)
  (cond ((is-variable x)
         (if (> (- y .5) 0)
             (return-binding x (- y .5))))
        ((is-variable y)
         (return-binding y (+ x .5)))
        ((< x y) t)))

(defun smaller-than-2in (x y)
  (cond ((is-variable x)
         'no-match-attempted)
        ((is-variable y)
         'no-match-attempted)
        (t
         (<= (- x y) 2))))


; Function used for finish operations.

(defun finishing-size (x y)
  (cond ((and (is-variable x)
              (is-variable y))
         'no-match-attempted)
        ((is-variable x)
         (return-binding x (+ y 0.002)))
        ((is-variable y)
         (if (> (- x 0.002) 0)
             (return-binding y (- x 0.002))))
        (t
         (<= (abs (- x y)) 0.003))))

; Functions for generating new values when two
; parts are welded together.

(defun new-size (d1 d2 d)
  (cond ((is-variable d1)
         'no-match-attempted)
        ((is-variable d2)
         'no-match-attempted)
        ((is-variable d)
         (return-binding d (+ d1 d2)))
        (t
         (= d (+ d1 d2)))))

(defun new-part (part part1 part2)
  (cond ((is-variable part)
         (return-binding part (new-name part1 part2)))
        (t)))

(defun new-material (material material1 material2)
  (if (is-variable material)
      (cond ((same material1 material2)
             (return-binding material material1))
            (t
             (return-binding
              material
              (new-name material1 material2))))
      t))

(defun new-name (name1 name2)
  (intern (concatenate 'string
                       (symbol-name name1)
                       "-"   ; separator string is illegible in the source; "-" is assumed
                       (symbol-name name2))))


; Return a PRODIGY binding: variable var is bound
; to value val.

(defun return-binding (var val)
  (list (list (list var val))))
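
The functions above appear to share one calling convention: when an argument is an unbound PRODIGY variable the function either returns a binding for it or reports that no match was attempted, and when all arguments are bound it acts as a plain test. The following Python sketch illustrates that convention for half-of and finishing-size. It is not part of EXPO: the `<name>` test in `is_variable`, the `NO_MATCH` constant, and the Python names are assumptions made for illustration, and the arithmetic operators follow a reading of the listing above that the scan leaves partly uncertain.

```python
NO_MATCH = "no-match-attempted"


def is_variable(x):
    """Assumed convention: an unbound variable is a string written as <name>."""
    return isinstance(x, str) and x.startswith("<") and x.endswith(">")


def return_binding(var, val):
    """Mirror of return-binding: a triply nested list holding one (var val) pair."""
    return [[[var, val]]]


def half_of(x, y):
    """Bind y to x/2 when y is unbound; otherwise test whether y is half of x."""
    if is_variable(x):
        return NO_MATCH
    if is_variable(y):
        return return_binding(y, x / 2)
    return x / 2 == y


def finishing_size(x, y):
    """Relate a size x to a finished size y that is 0.002 smaller."""
    if is_variable(x) and is_variable(y):
        return NO_MATCH
    if is_variable(x):
        return return_binding(x, y + 0.002)
    if is_variable(y):
        # Like the Lisp `if` without an else branch, yield nothing when
        # the finished size would not be positive.
        return return_binding(y, x - 0.002) if x - 0.002 > 0 else None
    return abs(x - y) <= 0.003
```

For example, `half_of(4, "<y>")` yields a binding of `<y>` to 2.0, while `half_of(4, 2)` simply tests the relation.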
B.3  Incomplete  Domains

The 16 preconditions missing in D_prec10 are the following:


operator                                   precondition

drill-with-high-helix-drill                (holding-tool <machine> <drill-bit>)
drill-with-gun-drill                       (has-spot <part> <hole> <side> <loc-x> <loc-y>)
drill-with-center-drill                    (has-spot <part> <hole> <side> <loc-x> <loc-y>)
tap                                        (holding-tool <machine> <drill-bit>)
tap                                        (is-clean <part>)
counterbore                                (holding-tool <machine> <drill-bit>)
ream                                       (has-fluid <machine> <fluid> <part>)
drill-with-twist-drill-in-milling-machine  (holding-tool <machine> <drill-bit>)
make-knurl-with-lathe                      (holding-tool <machine> <toolbit>)
make-knurl-with-lathe                      (~ (has-burrs <part>))
finish-shape                               (~ (has-burrs <part>))
cut-with-circular-friction-saw             (holding-tool <machine> <attachment>)
cut-with-band-saw                          (~ (has-burrs <part>))
hold-with-v-block                          (on-table <machine> <part>)
hold-with-centers                          (on-table <machine> <part>)
hold-with-magnetic-chuck                   (~ (has-burrs <part>))

The 44 preconditions missing in D_prec30 are the following:
operator                            precondition

drill-with-twist-drill              (holding-tool <machine> <drill-bit>)
drill-with-high-helix-drill         (has-fluid <machine> <fluid> <part>)
drill-with-high-helix-drill         (holding-tool <machine> <drill-bit>)
drill-with-straight-fluted-drill    (has-spot <part> <hole> <side> <loc-x> <loc-y>)
drill-with-oil-hole-drill           (has-fluid <machine> <fluid> <part>)
drill-with-gun-drill                (holding-tool <machine> <drill-bit>)
tap                                 (holding-tool <machine> <drill-bit>)
countersink                         (has-hole <part> <hole> <side> <depth> <diam> <loc-x> <loc-y>)
countersink                         (~ (has-burrs <part>))
counterbore                         (holding-tool <machine> <drill-bit>)
counterbore                         (~ (has-burrs <part>))
side-mill                           (holding-tool <machine> <milling-cutter>)
finish-turn                         (is-clean <part>)
make-thread-with-lathe              (is-clean <part>)
make-knurl-with-lathe               (holding-tool <machine> <toolbit>)
make-knurl-with-lathe               (is-clean <part>)
polish-with-lathe                   (material-of-abrasive-cloth <cloth> EMERY)
finish-shape                        (holding-tool <machine> <cutting-tool>)
finish-shape                        (~ (has-burrs <part>))
finish-shape-with-planer            (holding-tool <machine> <cutting-tool>)
rough-grind-with-hard-wheel         (~ (material-of <part> BRONZE))
rough-grind-with-hard-wheel         (~ (material-of <part> COPPER))
rough-grind-with-hard-wheel         (holding-tool <machine> <wheel>)
rough-grind-with-soft-wheel         (hardness-of-wheel <wheel> SOFT)
rough-grind-with-soft-wheel         (grit-of-wheel <wheel> COARSE-GRIT)
finish-grind-with-hard-wheel        (has-fluid <machine> <fluid> <part>)
finish-grind-with-hard-wheel        (grit-of-wheel <wheel> FINE-GRIT)
finish-grind-with-soft-wheel        (grit-of-wheel <wheel> FINE-GRIT)
finish-grind-with-soft-wheel        (is-clean <part>)
cut-with-circular-friction-saw      (holding-tool <machine> <attachment>)
polish-with-band-saw                (is-clean <part>)
metal-spray-coating-wear-resistant  (~ (material-of-wire <wire> TUNGSTEN))
metal-spray-coating-wear-resistant  (~ (material-of-wire <wire> MOLYBDENUM))
metal-spray-coating-wear-resistant  (is-clean <part>)
metal-spray-prepare                 (is-clean <part>)
hold-with-v-block                   (is-clean <part>)
hold-with-v-block                   (on-table <machine> <part>)
hold-with-toe-clamp                 (is-clean <part>)
secure-with-toe-clamp               (is-clean <part>)
hold-with-centers                   (~ (has-burrs <part>))
hold-with-centers                   (on-table <machine> <part>)
hold-with-collet-chuck              (has-device <machine> <holding-device>)
hold-with-collet-chuck              (~ (has-burrs <part>))
hold-with-magnetic-chuck            (is-clean <part>)
B.4  Problem  Sets

The  problems  used  to  train  and  test  EXPO  were  generated  randomly  as  follows.  A 
random  number  of  goals  is  chosen  between  1  and  9.  The  goals  are  chosen  from  a  list  of
machining  goals  that  include  size,  surface  finish,  surface  coating,  and  holes.  Then  a  start 
state  is  generated  from  a  machine  shop  description  that  contains  a  set  of  machines,  tools, 
holding  devices,  and  raw  materials.  The  solutions  of  the  problems  average  one  hundred 
steps. 

EXPO  was  tested  with  two  different  training  sets  of  100  problems  each.  Two  test  sets 
of  20  problems  each  were  used. 
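
The generation procedure described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not EXPO's generator: the contents of `SHOP`, the function names, sampling goal kinds with replacement, and the way a start state is drawn from the shop description are all hypothetical. Only the range of 1 to 9 goals, the four kinds of goals, the four categories in the shop description, and the set sizes (two training sets of 100 problems, two test sets of 20) come from the text.

```python
import random

# Kinds of machining goals named in the text.
GOAL_KINDS = ["size", "surface-finish", "surface-coating", "hole"]

# Hypothetical machine shop description; the entries are illustrative only.
SHOP = {
    "machines": ["drill", "lathe", "milling-machine"],
    "tools": ["twist-drill", "rough-toolbit"],
    "holding-devices": ["vise", "v-block"],
    "raw-materials": ["brass", "aluminum"],
}


def generate_problem(rng):
    """Draw 1 to 9 goals and a start state taken from the shop description."""
    n_goals = rng.randint(1, 9)  # both bounds inclusive
    goals = [rng.choice(GOAL_KINDS) for _ in range(n_goals)]
    # Assumption: the start state holds a non-empty random subset of each category.
    start_state = {
        kind: rng.sample(items, rng.randint(1, len(items)))
        for kind, items in SHOP.items()
    }
    return {"goals": goals, "start_state": start_state}


def generate_sets(seed=0):
    """Build two training sets of 100 problems and two test sets of 20."""
    rng = random.Random(seed)
    training = [[generate_problem(rng) for _ in range(100)] for _ in range(2)]
    test = [[generate_problem(rng) for _ in range(20)] for _ in range(2)]
    return training, test
```

Seeding the generator makes the problem sets reproducible, which matters when the same test sets are reused across training sets.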


B.5  Tables  of  Results 

This  section  presents  the  numerical  results  that  were  used  for  the  graphs  in  Chapter  6. 


B.5.1  Missing  10%  of  the  Preconditions 

The  following  tables  show  the  numerical  results  that  are  summarized  in  Figure  6.3  (10%
incompleteness):


number of            failures found
training problems    training set 1    training set 2

   0                       0                 0
  10                       6                 7
  20                       6                 7
  30                       9                 7
  40                       9                10
  50                       9                10
  60                       0                 0
  70                       0                 0
  80                       0                 0
  90                       0                 0
 100                       0                 0

                     successfully executed solutions
number of            training set 1             training set 2
training problems    test set 1   test set 2    test set 1   test set 2

   0                      5            5             5            6
  10                     15           15            17           19
  20                     15           15            17           19
  30                     15           15            18           20
  40                     15           20            20           20
  50                     15           20            20           20
  60                     15           20            20           20
  70                     15           20            20           20
  80                     15           20            20           20
  90                     15           20            20           20
 100                     15           20            20           20

New preconditions for D_prec10 were learned by EXPO with the first training set in the
following order:

1.  (~ (has-burrs <part>)) of operator cut-with-band-saw

2.  (holding-tool <machine> <drill-bit>) of operator drill-with-high-helix-drill

3.  (holding-tool <machine> <drill-bit>) of operator tap

4.  (is-clean <part>) of operator tap

5.  (has-fluid <machine> <fluid> <part>) of operator ream

6.  (holding-tool <machine> <attachment>) of operator cut-with-circular-friction-saw

7.  (on-table <machine> <part>) of operator hold-with-v-block

8.  (~ (has-burrs <part>)) of operator hold-with-magnetic-chuck

9.  (~ (has-burrs <part>)) of operator finish-shape

New preconditions for D_prec10 were learned by EXPO with the second training set in
the following order:

1.  (holding-tool <machine> <drill-bit>) of operator drill-with-high-helix-drill

2.  (holding-tool <machine> <drill-bit>) of operator tap

3.  (is-clean <part>) of operator tap

4.  (holding-tool <machine> <drill-bit>) of operator counterbore

5.  (has-fluid <machine> <fluid> <part>) of operator ream

6.  (holding-tool <machine> <attachment>) of operator cut-with-circular-friction-saw

7.  (~ (has-burrs <part>)) of operator cut-with-band-saw

8.  (on-table <machine> <part>) of operator hold-with-v-block

9.  (~ (has-burrs <part>)) of operator hold-with-magnetic-chuck

10.  (~ (has-burrs <part>)) of operator finish-shape

B.5.2  Missing  30%  of  the  Preconditions

The  following  tables  show  the  numerical  results  that  are  summarized  in  Figure  6.4  (30% 
incompleteness): 


number of            failures found
training problems    training set 1    training set 2

   0                       0                 0
  10                      19                16
  20                       o                 2
  30                       4                 4
  40                       2                 5
  50                       0                 4
  60                       1                 0
  70                       0                 0
  80                       0                 1
  90                       0                 0
 100                       0                 0

                     successfully executed solutions
number of            training set 1             training set 2
training problems    test set 1   test set 2    test set 1   test set 2

   0                      1            2             1            1
  10                      3           13             2           14
  20                      9           15             7           14
  30                      9           18             8           15
  40                     11           18            13           17
  50                     19           18            17           17
  60                     19           18            17           17
  70                     19           18            17           17
  80                     19           19            17           19
  90                     19           19            17           19
 100                     19           19            17           19

New preconditions for D_prec30 were learned by EXPO with the first training set in the
following order:

1.  (is-clean <part>) of operator polish-with-band-saw

2.  (is-clean <part>) of operator hold-with-toe-clamp

3.  (holding-tool <machine> <drill-bit>) of operator drill-with-twist-drill

4.  (has-fluid <machine> <fluid> <part>) of operator drill-with-high-helix-drill

5.  (holding-tool <machine> <drill-bit>) of operator drill-with-high-helix-drill

6.  (has-hole <part> <hole> <side> <depth> <diam> <loc-x> <loc-y>) of operator countersink

7.  (~ (has-burrs <part>)) of operator countersink

8.  (holding-tool <machine> <drill-bit>) of operator counterbore

9.  (~ (has-burrs <part>)) of operator counterbore

10.  (hardness-of-wheel <wheel> SOFT) of operator rough-grind-with-soft-wheel

11.  (is-clean <part>) of operator secure-with-toe-clamp

12.  (holding-tool <machine> <drill-bit>) of operator tap

13.  (holding-tool <machine> <attachment>) of operator cut-with-circular-friction-saw

14.  (holding-tool <machine> <wheel>) of operator rough-grind-with-hard-wheel

15.  (is-clean <part>) of operator metal-spray-prepare

16.  (has-fluid <machine> <fluid> <part>) of operator drill-with-oil-hole-drill

17.  (is-clean <part>) of operator hold-with-v-block

18.  (on-table <machine> <part>) of operator hold-with-v-block

19.  (is-clean <part>) of operator hold-with-magnetic-chuck

20.  (holding-tool <machine> <cutting-tool>) of operator finish-shape

21.  (~ (has-burrs <part>)) of operator finish-shape

22.  (grit-of-wheel <wheel> FINE-GRIT) of operator finish-grind-with-soft-wheel

23.  (is-clean <part>) of operator finish-grind-with-soft-wheel

24.  (holding-tool <machine> <milling-cutter>) of operator side-mill

New preconditions for D_prec30 were learned by EXPO with the second training set in
the following order:

1.  (holding-tool <machine> <drill-bit>) of operator drill-with-twist-drill

2.  (has-fluid <machine> <fluid> <part>) of operator drill-with-high-helix-drill

3.  (holding-tool <machine> <drill-bit>) of operator tap

4.  (holding-tool <machine> <drill-bit>) of operator drill-with-high-helix-drill

5.  (holding-tool <machine> <drill-bit>) of operator counterbore

6.  (~ (has-burrs <part>)) of operator counterbore

7.  (is-clean <part>) of operator metal-spray-prepare

8.  (has-fluid <machine> <fluid> <part>) of operator finish-grind-with-hard-wheel

9.  (grit-of-wheel <wheel> FINE-GRIT) of operator finish-grind-with-hard-wheel

10.  (holding-tool <machine> <wheel>) of operator rough-grind-with-hard-wheel

11.  (is-clean <part>) of operator hold-with-toe-clamp

12.  (has-hole <part> <hole> <side> <depth> <diam> <loc-x> <loc-y>) of operator countersink

13.  (~ (has-burrs <part>)) of operator countersink

14.  (is-clean <part>) of operator secure-with-toe-clamp

15.  (holding-tool <machine> <attachment>) of operator cut-with-circular-friction-saw

16.  (~ (material-of-wire <wire> TUNGSTEN)) of operator metal-spray-coating-wear-resistant

17.  (~ (material-of-wire <wire> MOLYBDENUM)) of operator metal-spray-coating-wear-resistant

18.  (is-clean <part>) of operator metal-spray-coating-wear-resistant

19.  (has-fluid <machine> <fluid> <part>) of operator drill-with-oil-hole-drill

20.  (is-clean <part>) of operator hold-with-v-block

21.  (on-table <machine> <part>) of operator hold-with-v-block

22.  (is-clean <part>) of operator hold-with-magnetic-chuck

23.  (holding-tool <machine> <cutting-tool>) of operator finish-shape

24.  (~ (has-burrs <part>)) of operator finish-shape

25.  (grit-of-wheel <wheel> FINE-GRIT) of operator finish-grind-with-soft-wheel

26.  (is-clean <part>) of operator finish-grind-with-soft-wheel

27.  (holding-tool <machine> <milling-cutter>) of operator side-mill
Appendix  C

EXPO’s  Implementation  of
Experimentation  Policies


The  control  rules  below  implement  EXPO’s  experimentation  strategies  for  PRODIGY,  as 
described  in  Section  4.4.1.  The  meta  predicates  that  are  used  by  these  control  rules  are 
described  afterwards. 


C.1  Policies


Search  Depth  and  Plan  Length 

(AVOID-DEEP-NODES
 (lhs (and (primary-candidate-node <node>)
           (below-exp-depth-limit <node>)))
 (rhs (reject node <node>)))

(AVOID-LONG-PLANS
 (lhs (and (primary-candidate-node <node>)
           (current-plan <node> <plan>)
           (is-too-long-plan <plan>)))
 (rhs (reject node <node>)))

(PREFER-SHORT-PLANS
 (priority 10)
 (lhs (and (candidate-node <node1>)
           (candidate-node <node2>)
           (node-pref-not-cached <node1> <node2>)
           (current-plan <node1> <pl1>)
           (current-plan <node2> <pl2>)
           (is-longer <pl2> <pl1>)))
 (rhs (prefer node <node1> <node2>)))

(PREFER-PLANS-WITH-FEWER-STATE-CHANGES
 (priority 10)
 (lhs (and (candidate-node <node1>)
           (candidate-node <node2>)
           (node-pref-not-cached <node1> <node2>)
           (current-state <node1> <state1>)
           (current-state <node2> <state2>)
           (has-fewer-changes <state1> <state2>)))
 (rhs (prefer node <node1> <node2>)))


Goal  Interactions 


(SUPPORT-TOP-GOAL-CONCORD
 (priority 10)
 (lhs (and (candidate-node <node1>)
           (candidate-node <node2>)
           (node-pref-not-cached <node1> <node2>)
           (current-goal <node1> <goal1>)
           (current-goal <node2> <goal2>)
           (does-top-goal-concord <goal1>)
           (not-does-top-goal-concord <goal2>)))
 (rhs (prefer node <node1> <node2>)))

(AVOID-TOP-PROTECTION-VIOLATION
 (priority 10)
 (lhs (and (candidate-node <node1>)
           (candidate-node <node2>)
           (node-pref-not-cached <node1> <node2>)
           (current-goal <node1> <goal1>)
           (current-goal <node2> <goal2>)
           (does-top-protection-violation <goal2>)
           (not-does-top-protection-violation <goal1>)))
 (rhs (prefer node <node1> <node2>)))

(AVOID-TOP-PREREQUISITE-VIOLATION
 (priority 10)
 (lhs (and (candidate-node <node1>)
           (candidate-node <node2>)
           (node-pref-not-cached <node1> <node2>)
           (current-goal <node1> <goal1>)
           (current-goal <node2> <goal2>)
           (does-top-prerequisite-violation <goal2>)
           (not-does-top-prerequisite-violation <goal1>)))
 (rhs (prefer node <node1> <node2>)))


Operators 


(REJECT-IRREVERSIBLE-OPS
 (lhs (and (current-node <node>)
           (candidate-op <node> <op>)
           (not-is-reversible <op>)))
 (rhs (reject operator <op>)))

(PREFER-OPS-WITH-FEWER-STATE-CHANGES
 (priority 10)
 (lhs (and (current-node <node>)
           (candidate-op <node> <op1>)
           (candidate-op <node> <op2>)
           (are-effects-of <op1> <eff1>)
           (are-effects-of <op2> <eff2>)
           (is-longer <eff2> <eff1>)))
 (rhs (prefer operator <op1> <op2>)))

(PREFER-RELIABLE-OPS
 (priority 10)
 (lhs (and (current-node <node>)
           (candidate-op <node> <op1>)
           (candidate-op <node> <op2>)
           (is-more-reliable <op1> <op2>)))
 (rhs (prefer operator <op1> <op2>)))


(PREFER-UNRELIABLE-OPS
 (priority 10)
 (lhs (and (current-node <node>)
           (candidate-op <node> <op1>)
           (candidate-op <node> <op2>)
           (not-reliable <op1>)))
 (rhs (prefer operator <op1> <op2>)))


(PREFER-REVERSIBLE-OPS
 (priority 10)
 (lhs (and (current-node <node>)
           (candidate-op <node> <op1>)
           (candidate-op <node> <op2>)
           (is-reversible <op1>)
           (not-is-reversible <op2>)))
 (rhs (prefer operator <op1> <op2>)))


Binding  Interactions


(PREFER-NO-OBJS-VERY-HIGH-PROTECTION
 (priority 10)
 (lhs (and (current-node <node>)
           (new-candidate-bindings <node> <binding-list-1>)
           (new-candidate-bindings <node> <binding-list-2>)
           (not-equal-lists <binding-list-1> <binding-list-2>)
           (has-objs-used-very-high-protection <binding-list-2>)
           (not-has-objs-used-very-high-protection <binding-list-1>)))
 (rhs (prefer bindings <binding-list-1> <binding-list-2>)))

(PREFER-LEAST-OBJS-VERY-HIGH-PROTECTION
 (priority 10)
 (lhs (and (current-node <node>)
           (new-candidate-bindings <node> <binding-list-1>)
           (new-candidate-bindings <node> <binding-list-2>)
           (not-equal-lists <binding-list-1> <binding-list-2>)
           (num-objs-used-very-high-protection <binding-list-1> <n1>)
           (num-objs-used-very-high-protection <binding-list-2> <n2>)
           (smaller <n1> <n2>)))
 (rhs (prefer bindings <binding-list-1> <binding-list-2>)))
C.2  Metapredicates

The  meta-predicates  defined  for  EXPO  are  the  following: 

•  (BELOW-EXP-DEPTH-LIMIT  <node>) 

Tests  whether  a  node  is  below  a  user-defined  depth. 

•  (CURRENT-PS  <ps>) 

Used  to  set  a  context  for  the  activation  of  the  rule.  Tests  the  current  problem 
solving  context.  Two  contexts  are  currently  defined:  main  and  experimentation. 

•  (NODE-LEVEL  <node>  <level>) 

Returns  the  depth  of  a  node. 

•  (IS-CURRENT-STATE  <node>  <state>) 

Returns  the  current  state  at  that  node. 

•  (CURRENT-PLAN  <node>  <plan>) 

Returns  the  current  plan  at  that  node. 

•  (IS-TOO-LONG-PLAN  <plan>) 

Tests  whether  the  plan  is  longer  than  a  user-defined  length. 

•  (HAS-FEWER-CHANGES  <state1>  <state2>)

Tests  whether  the  number  of  differences  with  the  initial  state  is  smaller  for  <state1>
than  for  <state2>.

•  (DOES-TOP-GOAL-CONCORD  <goal>) 

(NOT-DOES-TOP-GOAL-CONCORD  <goal> ) 

Test  whether  the  goal  is  the  same  as  any  pending  goals  in  the  main  plan. 

•  (DOES-TOP-PROTECTION-VIOLATION  <goal>) 

(NOT-DOES-TOP-PROTECTION-VIOLATION  <goal>) 

Test  whether  the  goal  clobbers  a  goal  previously  achieved  for  the  main  plan.

•  (DOES-TOP-PREREQUISITE-VIOLATION  <goal>)
(NOT-DOES-TOP-PREREQUISITE-VIOLATION  <goal>)

Test  whether  the  goal  clobbers  a  predicate  needed  for  later  steps  of  the  main  plan. 

•  (HAS-OBJS-USED-VERY-HIGH-PROTECTION  <obj>)

(NOT-HAS-OBJS-USED-VERY-HIGH-PROTECTION  <obj>) 

Test  whether  any  of  the  objects  is  of  a  very  high  protection  type. 

•  (NUM-OBJS-USED-VERY-HIGH-PROTECTION  <objs>  <n>)

Returns  how  many  objects  are  of  a  very  high  protection  type. 

•  (IS-MORE-RELIABLE  <op1>  <op2>)

(NOT-RELIABLE  <op1>  <op2>)

Test  whether  one  operator  is  more  reliable  than  another.  The  reliability  is  computed
as  the  ratio  of  the  number  of  successful  executions  to  the  number  of  failed  executions.

•  (IS-REVERSIBLE  <op>) 

(NOT-IS-REVERSIBLE  <op>)

Test  whether  the  operator  is  reversible. 

•  (ARE-EFFECTS-OF  <op>  <effects-list>)

Returns  the  effects  list  of  the  operator. 
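
Two of the meta-predicates above rest on simple counts, which the following Python sketch makes concrete. It is a minimal illustration under stated assumptions rather than EXPO's code: states are modelled as sets of literals, a difference is counted as a literal present in exactly one of the two states, and an operator with no failed executions is given infinite reliability. None of these choices is specified in the text, and all function names are hypothetical.

```python
def num_changes(initial_state, state):
    """Count literals that appear in exactly one of the two states."""
    return len(set(initial_state) ^ set(state))


def has_fewer_changes(initial_state, state1, state2):
    """True when state1 differs from the initial state less than state2 does."""
    return num_changes(initial_state, state1) < num_changes(initial_state, state2)


def reliability(successes, failures):
    """Ratio of successful to failed executions.

    Assumption: with no recorded failures the ratio is treated as infinite.
    """
    if failures == 0:
        return float("inf")
    return successes / failures


def is_more_reliable(op1_counts, op2_counts):
    """Compare two operators given (successes, failures) pairs."""
    return reliability(*op1_counts) > reliability(*op2_counts)
```

For instance, an operator with 8 successes and 2 failures has reliability 4.0 and is preferred over one with 3 successes and 3 failures.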

The  meta-level  predicates  used  by  EXPO  that  are  provided  by  PRODIGY  are  the
following:

•  (CANDIDATE-NODE  <node>)

Should  be  used  in  selecting,  rejecting,  and  preferring  nodes.  Tests  whether  a  node 
is  among  the  candidate  set  of  nodes  in  the  search  tree. 

•  (CURRENT-NODE  <node>) 

Tests  whether  <node>  has  been  chosen  as  current  node  in  this  decision  phase. 
•  (CANDIDATE-OP  <node>  <op>)

Tests  whether  <op>  is  a  member  of  the  relevant  operators  being  considered  at  the 
current  <node>. 

•  (CURRENT-OP  <node>  <op>) 

Tests  whether  <op>  is  the  current  operator  for  the  current  goal  at  the  current  node.

•  (CANDIDATE-BINDINGS  <bindings>  <node>) 

Tests  whether  <bindings>  is  a  member  of  the  default  set  of  candidate  bindings  for 
the  current  operator,  goal,  and  node. 

•  (KNOWN  <node>  <expression>)

Tests  if  an  expression  is  true  in  the  current  state  at  the  node. 

•  (IS-EQUAL  <x>  <y>) 

(NOT-EQUAL  <x>  <y>) 

These  test  for  equality  and  inequality. 